METHODS OF DIAGNOSIS OF PROSTATE CANCER,
COMPOSITIONS AND METHODS OF SCREENING FOR
MODULATORS OF PROSTATE CANCER

CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, all of which are
incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
prostate cancer, and to the use of such expression profiles and compositions in the diagnosis, 
prognosis and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer.



BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and the
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and the incidence of prostate cancer has been increasing
rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J. Urol.
7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). Prostate cancer develops as the
result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion
of surrounding tissue and, ultimately, metastasis.

Deaths from prostate cancer are a result of metastasis of a prostate tumor.
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Rago,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for the diagnosis, prognosis, and effective
treatment of prostate cancer, particularly metastatic prostate cancer, would be
desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.

SUMMARY OF THE INVENTION
The present invention therefore provides nucleotide sequences of genes that
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic
purposes, and also as targets for screening for therapeutic compounds that modulate prostate
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan from the following description of the invention.

In one aspect, the present invention provides a method of detecting a prostate
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining
the level of a prostate cancer-associated transcript in a cell from a patient.

In one embodiment, the present invention provides a method of detecting a
prostate cancer-associated transcript in a cell from a patient, the method comprising
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1-16.

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent
label.

In one embodiment, the polynucleotide is immobilized on a solid surface.

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human.

In one embodiment, the patient is suspected of having a taxol-resistant cancer.

In one embodiment, the prostate cancer-associated transcript is mRNA.



WO 02/30268    PCT/US01/32045



In one embodiment, the method further comprises the step of amplifying 
nucleic acids before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug-resistant (e.g., taxol-resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.
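
By way of illustration only, the comparison of step (iii) can be sketched as a fold-change calculation between an earlier and a later measurement of the same transcript. The function names, the pseudocount, and the 2-fold cutoff below are assumptions made for this sketch and are not taken from this disclosure.

```python
import math


def log2_fold_change(baseline_level, current_level, pseudocount=1.0):
    """Return log2(current / baseline) for two transcript measurements.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a transcript is undetectable at one time point.
    """
    if baseline_level < 0 or current_level < 0:
        raise ValueError("transcript levels must be non-negative")
    return math.log2((current_level + pseudocount) / (baseline_level + pseudocount))


def classify_change(baseline_level, current_level, threshold=1.0):
    """Label the change between two time points.

    threshold is in log2 units; 1.0 corresponds to a 2-fold change
    (a hypothetical cutoff chosen only for illustration).
    """
    change = log2_fold_change(baseline_level, current_level)
    if change <= -threshold:
        return "decreased"
    if change >= threshold:
        return "increased"
    return "unchanged"
```

For a transcript that is up-regulated in prostate cancer, a result of "decreased" relative to the pre-treatment sample would be the pattern of interest; the interpretation for any particular transcript is outside the scope of this sketch.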

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1-16.

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-16.

In another aspect, the present invention provides an antibody that specifically
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having a
polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical.

In one embodiment, the antibody is an antibody fragment. In another
embodiment, the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein.

In another aspect, the present invention provides a method of detecting
antibodies specific to prostate cancer in a patient, the method comprising contacting a
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16.

In another aspect, the present invention provides a method for identifying a
compound that modulates a prostate cancer-associated polypeptide, the method comprising
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional
effect of the compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic
effect, or a chemical effect.

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand
binding to the polypeptide.

In another aspect, the present invention provides a method of inhibiting
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein.

In one embodiment, the compound is an antibody.

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; and (ii) comparing the level of gene expression of a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell 
sample therefrom that has not been treated with the test compound. In another embodiment, 
the control is a normal cell or mammal. 

In one embodiment, the test compound is administered in varying amounts or
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the
drug candidate.

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug
candidates by providing a cell expressing a gene that is up- or down-regulated as in a
prostate cancer. In one embodiment, the gene is selected from Tables 1-16. The method
further includes adding a drug candidate to the cell and determining the effect of the drug
candidate on the expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease in expression.

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16 in a first tissue type of a first individual, and comparing that expression to the expression
of the gene from a second normal tissue type from the first individual or a second unaffected
individual. A difference in the expression indicates that the first individual has a disorder
associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence
of a gene that is neither up- nor down-regulated in prostate cancer.

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment, a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Tables 1-16.

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer 
protein, or a fragment thereof, comprising contacting an agent specific for said protein with 
said protein in an amount sufficient to effect neutralization. In another embodiment, the 
protein is encoded by a nucleic acid selected from those of Tables 1-16. 

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides 
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including 
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer. 

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to that of the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule or antibody that inhibits or alters a calcium signal mediated by
PBH1 will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, are also useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den Eynde and
Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1 indicates that
it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize PBH1-specific
cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided by this
invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of which
are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of
genes that exhibit increased or decreased expression in prostate cancer samples.



Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over
a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translation
processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
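
As an illustrative sketch only, percent identity over a designated region can be computed from two sequences that have already been aligned. The function name, the gap character, and the choice to count a gap opposite a residue as a non-identical position are assumptions of this sketch; alignment programs differ in how they choose the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns whose residues are the same.

    Both inputs must already be aligned and therefore of equal length.
    Columns where both sequences carry a gap are skipped; a gap opposite
    a residue is counted as a non-identical position (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

For example, two aligned 10-nucleotide segments that differ at a single position give 90% identity under this definition.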

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
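
For illustration, a minimal sketch of local alignment in the style of Smith & Waterman is given below. It uses a simple linear gap penalty, and the match, mismatch, and gap scores are hypothetical values chosen for the example; it is not the implementation found in any of the named software packages.

```python
def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scores are illustrative assumptions; a linear gap penalty is used.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; cells never drop below zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two aligned strings returned by this sketch have equal length, so they can be passed directly to a column-by-column identity count over the aligned region.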

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of
the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
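
The seeding and extension steps described above can be sketched, for nucleotide sequences only, as follows. This is a simplified illustration and not the BLAST software itself: it seeds on exact word matches rather than on neighborhood words scored against T, performs ungapped extension only, and searches a single strand. The function names and the value chosen for X are assumptions of the sketch; W, M, and N default to the BLASTN values stated above.

```python
from collections import defaultdict


def find_seeds(query, subject, w=11):
    """Yield (query_pos, subject_pos) for every exact shared word of length w."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], ()):
            yield i, j


def extend_one_way(query, subject, qi, sj, step, start, m, n, x):
    """Extend in one direction; return (best_score, residues_added_at_best)."""
    score = best = start
    length = best_len = 0
    # Stop at the end of either sequence, on an X-drop, or at a score <= 0.
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, qi, sj, w=11, m=5, n=-4, x=20):
    """Ungapped extension of a word hit at (qi, sj).

    Returns (score, query_start, query_end_exclusive) for the extended segment.
    """
    seed_score = w * m  # an exact word hit scores M per matching residue
    right, right_len = extend_one_way(query, subject, qi + w, sj + w, 1,
                                      seed_score, m, n, x)
    total, left_len = extend_one_way(query, subject, qi - 1, sj - 1, -1,
                                     right, m, n, x)
    return total, qi - left_len, qi + w + right_len
```

A larger X lets an extension tolerate a longer run of mismatches before stopping, which is one way the parameters trade speed for sensitivity.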

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
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An indication that two nucleic acid sequences or polypeptides are substantially 
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross 
reactive with the antibodies raised against the polypeptide encoded by the second nucleic 
acid, as described below. Thus, a polypeptide is typically substantially identical to a second 
5 polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another 
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described below. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 
same primers can be used to amplify the sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residues is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, silent variations of
a nucleic acid which encodes a polypeptide are often implicit in a described sequence with
respect to the expression product, but not with respect to actual probe sequences.
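The degeneracy argument above can be made concrete with a short Python sketch. It assumes the standard genetic code written in the RNA alphabet (so the tryptophan codon appears as UGG), and the helper names are hypothetical; it simply enumerates synonymous codons and counts how many coding sequences are silent variations of one another for a given peptide.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases U, C, A, G ('*' marks a stop codon).
_BASES = "UCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna):
    """Translate an RNA coding sequence (length divisible by 3) codon by codon."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def count_coding_sequences(peptide):
    """Number of distinct RNA sequences that encode the same peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)
```

For example, `synonymous_codons("A")` lists the four alanine codons named in the text, while methionine and tryptophan each return a single codon, which is why those positions admit no silent variation.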

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. The following
eight groups each contain amino acids that are typically
conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D),
Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y),
Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see,
e.g., Creighton, Proteins (1984)).
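As an illustration of how the groups listed above might be applied, the following Python sketch tests whether two peptides of equal length differ only by conservative substitutions. The function names are hypothetical, and the treatment of peptides of unequal length (rejected, since only substitutions are considered) is an assumption of the sketch rather than something the text prescribes.

```python
# The eight groups of one-letter amino acid codes listed in the text.
# Methionine (M) appears in two groups, so membership is checked group by group.
CONSERVATIVE_GROUPS = ["AG", "DE", "NQ", "RK", "ILMV", "FYW", "ST", "CM"]


def is_conservative(a, b):
    """True if residues a and b are identical or share at least one group."""
    return a == b or any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differ_only_conservatively(peptide_1, peptide_2):
    """True if the peptides have equal length and every position is conservative."""
    if len(peptide_1) != len(peptide_2):
        return False  # assumption: insertions and deletions are out of scope here
    return all(is_conservative(a, b) for a, b in zip(peptide_1, peptide_2))
```

Because the groups overlap at methionine, the relation is not transitive: C and M are conservative, M and L are conservative, but C and L are not.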

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghui &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded or single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
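The statement that a depicted single strand also defines its complement can be sketched in a few lines of Python. The sketch assumes sequences written 5' to 3' using only the four canonical bases of DNA or RNA; modified bases such as inosine are deliberately not handled, and the function name is hypothetical.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(sequence, rna=False):
    """Return the complementary strand, written 5' to 3'.

    Only A, C, G and T (or U when rna=True) are accepted; any other
    symbol raises ValueError rather than being silently passed through.
    """
    sequence = sequence.upper()
    allowed = "ACGU" if rna else "ACGT"
    unexpected = set(sequence) - set(allowed)
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sequence.translate(table)[::-1]
```

Applying the function twice returns the original sequence, which reflects the fact that either strand fully determines the other.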

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature, 144:945 (1962);
David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981);
and Nygren, J. Histochem. and Cytochem., 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilizes the radiolabeled peptide or antibody may
be used, including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form
of the cell or express native genes that are otherwise abnormally expressed, under expressed
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm
is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which
50% of the probes complementary to the target hybridize to the target sequence at equilibrium
(as the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications (Academic Press, Inc. N.Y.).
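The numerical guidelines in the preceding paragraph can be collected into a small Python sketch. It assumes the Tm is already known (the text gives no formula for computing it), treats "about" as an exact bound for the purpose of illustration, and classes any probe of 50 nucleotides or fewer as short; all function names are hypothetical.

```python
def stringent_temperature_range(tm_celsius):
    """Hybridization temperature window of about 5-10 degrees C below the Tm."""
    return tm_celsius - 10.0, tm_celsius - 5.0


def minimum_stringent_temperature(probe_length):
    """Lower temperature bound: 30 C for short probes, 60 C for probes over 50 nt.

    Assumption: every probe of 50 nucleotides or fewer counts as short.
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    return 30.0 if probe_length <= 50 else 60.0


def is_positive_signal(signal, background, fold=2.0):
    """True if the signal is at least `fold` times background.

    fold=2 is the minimum stated in the text; fold=10 is the preferred level.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background
```

For a probe with a Tm of 70°C, the first function returns a window of 60 to 65°C, which can then be checked against the length-dependent lower bound.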

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the'functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of a 

20 prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects. 
Such functional effects can be measured by any means known to those skilled in the art, e.g., 
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), 
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the prostate cancer protein; 

25 measuring binding activity or binding assays, e.g. binding to antibodies or other ligands, and 
measuring cellular proliferation. Determination of the functional effect of a compound on 
prostate cancer can also be performed using prostate cancer assays known to those of skill in 
the art such as an in vitro assays, e.g., cell growth on soft agar; anchorage dependence; 
contact inhibition and density limitation of growth; cellular proliferation; cellular 

30 transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
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expression in cells undergoing metastasis, and other characteristics of prostate cancer cells. 
The functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for prostate cancer-associated sequences, 
5 measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, 0-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide 
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules 

10 or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide 
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally 
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate 
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic 
acids may seem to inhibit expression and subsequent function of the protein. "Activators" 

15 are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, 
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also 
include genetically modified versions of prostate cancer proteins, e.g., versions with altered 
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences
set out in Tables 1-16. 

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%,
preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the
control), more preferably 1000-3000% higher.

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage
independence, semi-solid or soft agar growth, changes in contact inhibition and density
limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells: A Manual of Basic
Technique pp. 231-241 (3rd ed. 1994).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.

"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers
to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney,
Culture of Animal Cells: A Manual of Basic Technique (3rd ed. 1994)).

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair 
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus 
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the modification of whole
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding,
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such
as other mammals, may be used to express humanized antibodies. Alternatively, phage
display technology can be used to identify antibodies and heteromeric Fab fragments that
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554
(1990); Marks et al., Biotechnology 10:779-783 (1992)).

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 


Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue
samples of prostate and other tissues from surviving cancer patients. By comparing
expression profiles of tissue in known different prostate cancer states, information regarding
which genes are important (including both up- and down-regulation of genes) in each of these 
states is obtained. 
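The idea of an expression profile as a "fingerprint" can be illustrated with a small sketch in which a sample profile is matched to the most similar reference profile. Pearson correlation is used here only as one plausible similarity measure; the text does not prescribe a particular statistic. The function names, the state labels, and the assumption that all profiles list the same genes in the same order are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def closest_state(sample, reference_profiles):
    """Return the reference state whose profile correlates best with the sample.

    reference_profiles maps a state label to a list of expression values,
    with genes in the same order as in `sample` (an assumption of this sketch).
    """
    return max(reference_profiles,
               key=lambda state: pearson(sample, reference_profiles[state]))
```

A single gene may be expressed similarly in two states, but the correlation over many genes separates them, which is the point made in the paragraph above.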

The identification of sequences that are differentially expressed in prostate
cancer versus non-prostate cancer tissue allows the use of this information in a number of
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a
particular patient? Similarly, diagnosis and treatment outcomes may be done or confirmed by
comparing patient samples with the known expression profiles. Metastatic tissue can also be
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; e.g., screening can be done for drugs
that suppress the prostate cancer expression profile. This may be done by making biochips 
comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 

Thus the present invention provides nucleic acid and protein sequences that
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed 
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed 
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans;
however, as will be appreciated by those in the art, prostate cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other 
prostate cancer sequences are provided, from vertebrates, including mammals, including 
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, 
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences
from other organisms may be obtained using the techniques outlined below.

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as
screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying prostate cancer-associated sequences, the prostate cancer
screen typically includes comparing genes identified in different tissues, e.g., normal and 
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs. non-metastatic
tissue. Other suitable tissue comparisons include comparing prostate cancer
samples with metastatic cancer samples from other cancers, such as lung, breast,
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g.,
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips 
comprising nucleic acid probes. The samples are first microdissected, if applicable, and 
treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein
are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, without limitation, lung, heart, brain, liver, breast, kidney,
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
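The filtering step described above, in which candidate genes expressed in other tissues are removed from the profile, might be sketched as follows. The function name, the data layout, and in particular the numeric threshold standing in for "any significant amount" are assumptions made for illustration; the text does not define a cutoff.

```python
def disease_specific(candidates, normal_expression, threshold):
    """Keep candidate genes whose expression in every other tissue is below threshold.

    candidates: iterable of gene identifiers from the prostate cancer screen.
    normal_expression: dict mapping gene -> {tissue name: expression level}.
    threshold: hypothetical cutoff standing in for "any significant amount".
    Genes with no normal-tissue data are kept (another assumption of this sketch).
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept
```

Because some embodiments do not require this step, a caller could simply skip the filter and use the unfiltered candidate list.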

In a preferred embodiment, prostate cancer sequences are those that are up-regulated
in prostate cancer; that is, the expression of these genes is higher in the prostate
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often 
means at least about a two-fold change, preferably at least about a three fold change, with at 
least about five-fold or higher being preferred. All unigene cluster identification numbers
and accession numbers herein are for the GenBank sequence database and the sequences of
the accession numbers are hereby expressly incorporated by reference. GenBank is known in 
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g., 
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ). 

In another preferred embodiment, prostate cancer sequences are those that are
down-regulated in prostate cancer, that is, the expression of these genes is lower in prostate
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14). "Down-regulation"
as used herein often means at least about a 1.5-fold change, more preferably a
two-fold change, preferably at least about a three-fold change, with at least about five-fold or
higher being most preferred.
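The fold-change criteria for up- and down-regulation can be made concrete with a short sketch. The default cutoffs below use the least stringent values given in the text (about two-fold for up-regulation, about 1.5-fold for down-regulation); the function names, the requirement of positive expression values, and the "unchanged" label are assumptions of this illustration.

```python
def fold_change(tumor_level, normal_level):
    """Ratio of expression in prostate cancer tissue to non-cancerous tissue."""
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    return tumor_level / normal_level


def regulation(tumor_level, normal_level, up_min=2.0, down_min=1.5):
    """Label a gene using fold-change cutoffs.

    up_min: minimum fold increase for "up-regulated" (about two-fold in the text).
    down_min: minimum fold decrease for "down-regulated" (about 1.5-fold in the text).
    """
    ratio = fold_change(tumor_level, normal_level)
    if ratio >= up_min:
        return "up-regulated"
    if ratio <= 1.0 / down_min:
        return "down-regulated"
    return "unchanged"
```

Stricter embodiments (three-fold, five-fold) correspond to passing larger values for `up_min` or `down_min`.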


Informatics 

The ability to identify genes that are over- or under-expressed in prostate
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used
in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein
structure, biosensor development, and other related areas. For example, the expression
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer.
Or as another example, subcellular toxicological information can be generated to better direct
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the BC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially 
any form in which data can be maintained and transmitted, but is preferably an electronic
database. The electronic database of the invention can be maintained on any electronic
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar
databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative
and/or absolute abundance of a variety of molecular and macromolecular species from a 
biological sample undergoing prostate cancer, i.e., the identification of prostate cancer-associated
sequences described herein, provide an abundance of information, which can be
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic
monitoring, gene-disease causal linkages, identification of correlates of immunity and 
physiological status, among others. Although the data generated from the assays of the 
invention is suited for manual review and analysis, in a preferred embodiment, prior data 
processing using high-speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational
database system for storing biomolecular sequence information in a manner that allows
sequences to be catalogued and searched according to one or more protein function
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records
containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects
for obtaining full-length sequences from the collection of partial length sequences. U.S.
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a multi-dimensional
database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis:
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999);
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis &
Ouellette eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et
al., eds 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds, 2000);
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins &
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995).

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing
sample is from a control tissue sample known to be free of pathological disorders. In a
variation, at least one of the sources is a known pathological tissue specimen, e.g., a 
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another 
variation, the assay records cross-tabulate one or more of the following parameters for each
target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
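One way to picture the cross-tabulated assay records is as a small data structure holding the three parameters listed above, grouped by target and by sample source. The class name, field names, and nested-dictionary layout below are hypothetical choices made for this sketch; the text does not specify a schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code of the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Return {target_id: {sample_source: quantity}} for a list of records.

    If the same target and source appear more than once, the last record wins
    (a simplification made for this sketch).
    """
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return {target: dict(by_source) for target, by_source in table.items()}
```

With records from a control tissue and a pathological specimen, the resulting table lets the quantity of each target be read off per source.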

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, 
magnetic bubble memory devices, and other data storage devices, including CPU registers 
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern 
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique 
identifiers for at least 10 target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides 
a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 
comparing a query target to a database containing an array of data structures, such as an assay
result obtained by the method of the invention, and ranking database targets based on the
degree of identity and gap weight to the target data. A central processor is preferably
initialized to load and execute the computer program for alignment and/or comparison of the 
assay results. Data for a query target is entered into the central processor via an I/O device. 
Execution of the computer program results in the central processor retrieving the assay data 
from the data file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM, 
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence 
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the
same characteristic of the query target and results are output via an I/O device. For example,
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, 
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or 
public domain molecular biology software package (e.g., UWGCG Sequence Analysis 
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem,
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 

The invention also preferably provides the use of a computer system, such as
that described above, which comprises: (1) a computer, (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention, 
which may be stored in the computer, (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values.
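The rank-ordering of comparison results by computed similarity can be illustrated with a brief sketch. Python's standard-library `difflib.SequenceMatcher` is used here purely as a stand-in similarity measure; it is not one of the alignment programs named in the text (FASTA, GAP, BESTFIT, and so on), and its ratio is not an alignment score with gap weights. The function name and the example sequences in the test are hypothetical.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank database sequences by similarity to the query, best first.

    query: sequence string for the comparison target.
    database: dict mapping record name -> stored sequence string.
    Returns a list of (name, similarity) pairs, where similarity is the
    SequenceMatcher ratio in the range 0.0 to 1.0. Ties are broken by name.
    """
    scored = [
        (SequenceMatcher(None, query, sequence).ratio(), name)
        for name, sequence in database.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored]
```

In a real system the similarity function would be replaced by an alignment program's score, while the sort-and-report step would stay essentially the same.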


Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular
function and replication (including, e.g., signaling pathways); aberrant expression of such
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Molecular
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease 
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins 
also serve as docking proteins that are involved in organizing complexes of proteins, or
targeting proteins to various subcellular localizations, and are involved in maintaining the 
structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence
in the proteins of one or more motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of primary
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc.
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322
(1998)).

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell.
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular 
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain-containing
proteins.

Transmembrane proteins may contain from one to many transmembrane 
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor 
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors 
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane 
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of 
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed by charged amino acids. Therefore, upon analysis of the amino acid
sequence of a particular protein, the localization and number of transmembrane domains 
within the protein may be predicted (see, e.g. PSORT web site http://psort.nibb.ac.jp/). 
Important transmembrane protein receptors include, but are not limited to the insulin 
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose 

20 transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein 
receptor, epidermal growth factor receptor, leptin receptor, interleukin receptors, e.g. IL-1 
receptor, IL-2 receptor, 
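
The prediction of transmembrane domains from a run of approximately 20 hydrophobic residues can be illustrated with a minimal sketch. The Kyte-Doolittle hydropathy scale, the window of 20 residues, the threshold of 1.6, and the function name predict_tm_segments are assumptions chosen for illustration; they are not specified herein and do not represent the method used by the PSORT web site.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the text names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_tm_segments(seq, window=20, threshold=1.6):
    """Return (start, end) spans, 0-based and half-open, of candidate
    transmembrane segments.

    A window is a hit when its mean hydropathy is at least `threshold`
    (hypothetical cutoff). Overlapping or adjacent hits are merged.
    """
    seq = seq.upper()
    hit_starts = []
    for i in range(len(seq) - window + 1):
        # Unknown residue codes contribute a neutral 0.0.
        mean = sum(KD.get(aa, 0.0) for aa in seq[i:i + window]) / window
        if mean >= threshold:
            hit_starts.append(i)

    segments = []
    for start in hit_starts:
        end = start + window
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend the previous span
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Toy sequence: a 20-residue leucine stretch flanked by charged residues.
    toy = "D" * 10 + "K" * 10 + "L" * 20 + "K" * 10 + "D" * 10
    print(predict_tm_segments(toy))
```

The number of predicted transmembrane domains is the length of the returned list; because windows that only partly overlap a hydrophobic stretch can still score above the cutoff, the reported spans are approximate.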

The extracellular domains of transmembrane proteins are diverse; however,
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like.
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also

bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI)
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate
with the extracellular matrix and contribute to the maintenance of the cell structure.

Prostate cancer proteins that are transmembrane are particularly preferred in
the present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can 
be made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood, plasma, serum, or stool tests.

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid
or amino acid sequence, and is generally determined as outlined below, using either
homology programs or hybridization conditions. Typically, linked sequences on an mRNA are
found on the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid 
segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art,
using the sequences provided herein, extended sequences, in either direction, of the prostate
cancer genes can be obtained, using techniques well known in the art for cloning either longer
sequences or the full length sequences; see Ausubel, et al., supra. Much can be done by
informatics and many sequences can be clustered to include multiple sequences
corresponding to a single gene, e.g., systems such as UniGene (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g.,
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic
acids and proteins.

The prostate cancer nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are 
made and attached to biochips to be used in screening and diagnostic methods, as outlined 
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications.
Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer
proteins can be put into expression vectors for the expression of prostate cancer proteins, 
again for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof)
are made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
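
The notion of imperfect complementarity between a probe and its target can be illustrated by counting base pair mismatches. The sketch below is for illustration only: "substantially complementary" is defined herein by hybridization under the stated conditions, not by a mismatch count, and the function names and the max_mismatch_fraction cutoff are hypothetical.

```python
# Map each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions where `probe` fails to pair with `target_site`.

    Both sequences are written 5' to 3' and must have equal length; the
    probe is compared against the perfect (reverse-complement) probe.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(p != q for p, q in zip(probe.upper(), perfect_probe))


def within_mismatch_budget(probe, target_site, max_mismatch_fraction=0.1):
    """Crude sequence-only screen; the cutoff is an assumed value and is
    no substitute for testing hybridization at a given stringency."""
    allowed = max_mismatch_fraction * len(probe)
    return count_mismatches(probe, target_site) <= allowed


if __name__ == "__main__":
    site = "ATGCATGCATGCATGCATGC"
    probe = reverse_complement(site)
    print(count_mismatches(probe, site), within_mismatch_budget(probe, site))
```

A screen of this kind could only flag candidate probes for follow-up; whether a given probe hybridizes is an empirical question that depends on the conditions used.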

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with 
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
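
The use of several overlapping or separate probes per target can be sketched as follows. The even spacing of probes along the target, the default probe length of 40 bases, and the function name tile_probes are assumptions made for illustration; only the length range of about 8 to about 100 bases and the preference for three probes are taken from the description above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target, probe_len=40, n_probes=3):
    """Return a list of (start, probe) pairs spread evenly over `target`.

    Each probe is the reverse complement of target[start:start + probe_len].
    Probes overlap when the target is too short to hold them separately.
    """
    if not 8 <= probe_len <= 100:
        raise ValueError("probe_len outside the general 8 to 100 base range")
    if n_probes < 1:
        raise ValueError("at least one probe is required")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")

    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        # Evenly spaced start positions; duplicates are dropped.
        starts = sorted({round(k * last_start / (n_probes - 1))
                         for k in range(n_probes)})
    return [(s, reverse_complement(target[s:s + probe_len])) for s in starts]


if __name__ == "__main__":
    target = "ACGT" * 25  # 100-base toy target
    for start, probe in tile_probes(target):
        print(start, probe)
```

With a 100-base target and three 40-base probes, neighboring probes share 10 bases; a longer target would yield separate, non-overlapping probes.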

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as 
will be appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid 
support" or other grammatical equivalents herein is meant a material that can be modified to 
contain discrete individual sites appropriate for the attachment or association of the nucleic 
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large; possible substrates include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based
materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable Low
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999, 
herein incorporated by reference in its entirety. 

Generally the substrate is planar, although as will be appreciated by those in
the art, other configurations of substrates may be used as well. For example, the probes may
be placed on the inside surface of a tube, for flow-through sample analysis to minimize 
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including 
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may 
be via an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is 
known in the art. For example, photoactivation techniques utilizing photopolymerization
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression
level of prostate cancer-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990).
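
The proportionality between amplification product and starting template can be made concrete with an idealized model in which each cycle multiplies the product by (1 + E), where E is the per-cycle efficiency. This exponential model, the assumption of constant efficiency, and the function names below are illustrative assumptions and are not taken from the cited protocols.

```python
def amplified_copies(initial_copies, cycles, efficiency=1.0):
    """Product after `cycles` rounds of idealized exponential amplification.

    efficiency = 1.0 means perfect doubling each cycle (an assumption;
    real reactions plateau and seldom reach this value).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def relative_template(cycles_sample, cycles_control, efficiency=1.0):
    """Ratio of starting template (sample / control), inferred from the
    number of cycles each reaction needs to reach the same product level."""
    return (1.0 + efficiency) ** (cycles_control - cycles_sample)


if __name__ == "__main__":
    # Doubling the template doubles the product at any fixed cycle number.
    print(amplified_copies(200, 10) / amplified_copies(100, 10))
    # A sample reaching the threshold 3 cycles before the control started
    # with 8 times more template under perfect doubling.
    print(relative_template(22, 25))
```

Under this model, a comparison to a control reaction converts a difference in cycle number into a fold difference in starting template, which is the sense in which appropriate controls provide a measure of the amount of RNA.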

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase 
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Nat. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR,
etc.

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., encoding
prostate cancer proteins, are used to make a variety of expression vectors to express prostate
cancer proteins which can then be used in screening assays, as described below. Expression
vectors and recombinant DNA technology are well known to those of skill in the art (see,
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and
are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences"
refers to DNA sequences used for the expression of an operably linked coding sequence in a
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or 
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked 
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory 
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer 
protein. Numerous types of appropriate expression vectors, and suitable regulatory 
sequences are known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a
prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. Selection genes are 
well known in the art and will vary with the host cell used.

The prostate cancer proteins of the present invention are produced by culturing 
a host cell transformed with an expression vector containing nucleic acid encoding a prostate 
cancer protein, under the appropriate conditions to induce or cause expression of the prostate 
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with
the choice of the expression vector and the host cell, and will be easily ascertained by one
skilled in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect 
and animal cells, including mammalian cells. Of particular interest are Saccharomyces 
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells,
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in 
mammalian cells. Mammalian expression systems are also known in the art, and include 
retroviral and adenoviral systems. One expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

In a preferred embodiment, prostate cancer proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the prostate cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of 
bacterial strains that have been transformed. Suitable selection genes include genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptococcus lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in
the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based
expression vectors, are well known in the art.

In a preferred embodiment, prostate cancer protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The prostate cancer protein may also be made as a fusion protein, using
techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to
increase expression, or for other reasons. For example, when the prostate cancer protein is a
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic
acid for expression purposes.

In a preferred embodiment, the prostate cancer protein is purified or isolated 
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways 
known to those skilled in the art depending on what other components are present in the 
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse-phase
HPLC chromatography, and chromatofocusing. For example, the prostate cancer
protein may be purified using a standard anti-prostate cancer protein antibody column.
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also
useful. For general guidance in suitable purification techniques, see Scopes, Protein
Purification (1982). The degree of purification necessary will vary depending on the use of
the prostate cancer protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins

In one embodiment, the prostate cancer proteins are derivative or variant
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more
fully below, the derivative prostate cancer peptide will often contain at least one amino acid
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the 
prostate cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the 
present invention are amino acid sequence variants. These variants typically fall into one or
more of three classes: substitutional, insertional or deletional variants. These variants
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in
recombinant cell culture as outlined above. However, variant prostate cancer protein
fragments having up to about 100-150 residues may be prepared by in vitro synthesis using
established techniques. Amino acid sequence variants are characterized by the predetermined
nature of the variation, a feature that sets them apart from naturally occurring allelic or
interspecies variation of the prostate cancer protein amino acid sequence. The variants
typically exhibit the same qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed prostate cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, e.g., 
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using 
assays of prostate cancer protein activities. 
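
The set of variants that mutagenesis at a single predetermined codon can produce may be enumerated computationally. The sketch below lists every sense-codon replacement at one position of a coding sequence using the standard genetic code; it is a deterministic stand-in for random mutagenesis at the target codon, and the function name saturate_codon and the toy sequence are hypothetical.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def saturate_codon(coding_dna, codon_index):
    """Return {amino acid: [variant DNA, ...]} for one target codon.

    `codon_index` is 0-based. Stop codons and codons that encode the
    wild-type residue are skipped, so each entry is a true substitution.
    """
    coding_dna = coding_dna.upper()
    start = 3 * codon_index
    wild_codon = coding_dna[start:start + 3]
    if codon_index < 0 or len(wild_codon) != 3:
        raise ValueError("codon_index is outside the coding sequence")
    wild_aa = CODON_TABLE[wild_codon]

    variants = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*" or aa == wild_aa:
            continue
        mutated = coding_dna[:start] + codon + coding_dna[start + 3:]
        variants.setdefault(aa, []).append(mutated)
    return variants


if __name__ == "__main__":
    # Toy coding sequence Met-Ala-Lys; vary the alanine codon.
    library = saturate_codon("ATGGCTAAA", 1)
    print(sorted(library))
```

Each enumerated sequence corresponds to one candidate that would then be expressed and screened in an activity assay as described above.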

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to 
arrive at a final derivative. Generally these changes are done on a few amino acids to 
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those described above. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure;
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
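
The four classes (a) through (d) of less conservative substitutions can be illustrated with a
short sketch. The residue sets below contain only the examples named in the text, written as
one-letter codes; the function and set names are hypothetical and chosen for illustration only,
and a practical classifier would likely use fuller residue classes.

```python
# Residue sets hold only the examples named in the text (one-letter codes);
# they are illustrative assumptions, not exhaustive chemical classes.
HYDROPHILIC = {"S", "T"}                 # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}  # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}        # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}             # glutamyl, aspartyl
BULKY = {"F"}                            # phenylalanine
NO_SIDE_CHAIN = {"G"}                    # glycine
CYS_OR_PRO = {"C", "P"}                  # cysteine, proline


def _crosses(a, b, first, second):
    """Return True if one residue is in `first` and the other is in `second`."""
    return (a in first and b in second) or (a in second and b in first)


def nonconservative_classes(original, replacement):
    """Return the labels among 'a'-'d' that a single-residue substitution matches."""
    if original == replacement:
        return []
    labels = []
    if _crosses(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        labels.append("a")
    if original in CYS_OR_PRO or replacement in CYS_OR_PRO:
        labels.append("b")
    if _crosses(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        labels.append("c")
    if _crosses(original, replacement, BULKY, NO_SIDE_CHAIN):
        labels.append("d")
    return labels
```

An empty list means the substitution matches none of the four classes under these example sets;
it does not by itself establish that the substitution is conservative.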

Covalent modifications of prostate cancer polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino 
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is 
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-
maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the N-
terminal amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many
ways. For example, the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the 
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys. 259:52 (1987) and by Edge et al., Anal. Biochem. 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol. 138:350 (1987).

Another type of covalent modification of prostate cancer polypeptides comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

Prostate cancer polypeptides of the present invention may also be modified in 
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the 
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., BioTechnology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed.
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
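
The primer length ranges stated above can be checked with a short sketch. The function name,
the returned labels, and the use of "I" as the symbol for inosine are assumptions made for
illustration; the word "about" in the text is treated here as a hard cutoff for simplicity.

```python
def primer_length_class(primer):
    """Classify a candidate primer by the length ranges quoted in the text."""
    seq = primer.upper()
    allowed = set("ACGTI")  # "I" stands for inosine (assumed symbol)
    if not seq or set(seq) - allowed:
        raise ValueError("primer may contain only A, C, G, T or I")
    length = len(seq)
    if 20 <= length <= 30:
        return "within 20-30 range"
    if 15 <= length <= 35:
        return "within 15-35 range"
    return "outside stated ranges"
```

For example, a 24-mer falls in the narrower 20-30 range, while a 16-mer falls only in the
broader 15-35 range. The sketch checks length and alphabet only; it does not assess uniqueness,
degeneracy, or melting temperature.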

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to 
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly to linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or a fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal 
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-16, a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture
medium that preferably contains one or more substances that inhibit the growth or survival of
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific 
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof, and the other one is
for any other antigen, preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the prostate cancer proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the 
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology
10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody 
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response.

In a preferred embodiment the prostate cancer proteins against which 
antibodies are raised are secreted proteins as described above. Without being bound by
theory, antibodies used for treatment bind the secreted protein and prevent it from binding to
its receptor, thereby inactivating the secreted prostate cancer protein.

In another preferred embodiment, the prostate cancer protein to which 
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein, thereby
mediating cytotoxicity or antibody-dependent cytotoxicity (ADCC). Thus, prostate cancer is
treated by administering to a patient antibodies directed against the transmembrane prostate
cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector 
moiety. The effector moiety can be any number of molecules, including labelling moieties
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic 
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer
proteins not only serves to increase the local concentration of therapeutic moiety in the
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the prostate cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a
nuclear localization signal.

The prostate cancer antibodies of the invention specifically bind to prostate 
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.
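
The binding thresholds above can be expressed as a simple lookup. This is a minimal sketch
under two assumptions: the thresholds are read as the monotonic series 0.1 mM, 1 µM, 0.1 µM
and 0.01 µM, and "at least" a given Kd is read as a dissociation constant equal to or lower
than that value (i.e., binding at least that tight). The names `KD_TIERS` and `binding_tier`
and the tier labels are hypothetical.

```python
# Thresholds in mol/L, tightest first. Assumed values: 0.01 uM, 0.1 uM, 1 uM, 0.1 mM.
KD_TIERS = [
    (1e-8, "most preferred"),
    (1e-7, "preferred"),
    (1e-6, "more usual"),
    (1e-4, "minimum for specific binding"),
]


def binding_tier(kd_molar):
    """Return the tightest tier whose threshold the dissociation constant meets."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    for threshold, label in KD_TIERS:
        if kd_molar <= threshold:
            return label
    return "not specific"
```

A lower Kd means tighter binding, so the tiers are tested from the smallest threshold upward.
Selectivity against other proteins is a separate consideration and is not captured by this
lookup.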

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated
to provide expression profiles. An expression profile of a particular cell state or point of
development is essentially a "fingerprint" of the state. While two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes
are important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular
state, relative to another state, thus permitting comparison of two or more states. A
qualitatively regulated gene will exhibit an expression pattern within a state or cell type
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting
in an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
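
The percentage thresholds above can be applied to a pair of expression measurements as
sketched below. The text does not define how the percent change is computed, so the sketch
assumes the change is measured relative to a reference (e.g., normal tissue) value; the
function names are hypothetical.

```python
# Thresholds (percent change) taken from the text, largest first.
THRESHOLDS = [300.0, 200.0, 150.0, 100.0, 50.0]


def percent_change(reference, sample):
    """Signed percent change of `sample` relative to `reference` (assumed definition)."""
    if reference <= 0:
        raise ValueError("reference expression level must be positive")
    return (sample - reference) / reference * 100.0


def highest_threshold_met(reference, sample):
    """Return the largest stated threshold met by the absolute change, or None."""
    magnitude = abs(percent_change(reference, sample))
    for threshold in THRESHOLDS:
        if magnitude >= threshold:
            return threshold
    return None
```

Under this definition a decrease can never exceed 100%, so the larger thresholds are reachable
only by upregulation; a fold-change or ratio-based definition would treat the two directions
symmetrically and may be closer to what a given assay reports.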



WO 02/30268 



PCT/US01/32045 



Evaluation may be at the gene transcript or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to 
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype, 
can be evaluated in a prostate cancer diagnostic test. 

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein 
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected, 
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-
indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins.
A preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the prostate cancer protein find use
in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one
to many antibodies to the prostate cancer protein(s). Following washing to remove
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the prostate cancer protein(s)
contains a detectable label, e.g., an enzyme marker that can act on a substrate. In another
preferred embodiment each one of multiple primary antibodies contains a distinct and
detectable label. This method finds particular use in simultaneous screening for a plurality of
prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many
other histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a
fluorescence activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing prostate
cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are
useful as samples to be probed or tested for the presence of prostate cancer proteins.
Antibodies can be used to detect a prostate cancer protein by previously described
immunoassay techniques including ELISA, immunoblotting (western blotting),
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of
antibodies may indicate an immune response against an endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer 
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including 
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., 
Ausubel, supra) is then performed. When comparing the fingerprints between an individual 
and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on
the findings. It is further understood that the genes which indicate the diagnosis may differ 
from those which indicate the prognosis and molecular profiling of the condition of the cells 
may lead to distinctions between responsive or refractory conditions or may be predictive of 
outcomes. 

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer,
in terms of long term prognosis. Again, this may be done on either a protein or gene level,
with the use of genes being preferred. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment members of the proteins, nucleic acids, and
antibodies as described herein are used in drug screening assays. The prostate cancer
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer
sequences are used in drug screening assays or by evaluating the effect of drug candidates on
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment,
the expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes after treatment with a candidate
agent (e.g., Zlokarnik, et al., Science 279:84-8 (1998); Heid, Genome Res. 6:986-94 (1996)).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 
acids, modified proteins and cells containing the native or modified prostate cancer proteins 
are used in screening assays. That is, the present invention provides novel methods for
screening for compositions which modulate the prostate cancer phenotype or an identified
physiological function of a prostate cancer protein. As above, this can be done on an
individual gene level or by evaluating the effect of drug candidates on a "gene expression
profile". In a preferred embodiment, the expression profiles are used, preferably in
conjunction with high throughput screening techniques to allow monitoring for expression
profile genes after treatment with a candidate agent, see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays 
may be executed. In a preferred embodiment, assays may be run on an individual gene or 
protein level. That is, having identified a particular gene as up regulated in prostate cancer, 
test compounds can be screened for the ability to modulate gene expression or for binding to
the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in
gene expression. The preferred amount of modulation will depend on the original change of
the gene expression in normal tissue versus tissue undergoing prostate cancer, with changes of at
least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000%
or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue
compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a
10-fold increase in expression to be induced by the test compound.
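
By way of illustration only, the relationship between an observed fold change and the target modulation described above can be sketched as follows. The function names target_fold_change and percent_change are hypothetical and are not part of the specification; fold changes are assumed to be expressed as the ratio of expression in prostate cancer tissue to expression in normal tissue.

```python
def target_fold_change(cancer_vs_normal: float) -> float:
    """Return the fold change a test compound would need to induce
    to bring expression back to the normal-tissue level.

    cancer_vs_normal is the ratio (cancer expression / normal expression),
    e.g. 4.0 for a 4-fold increase, 0.1 for a 10-fold decrease.
    """
    if cancer_vs_normal <= 0:
        raise ValueError("fold change must be positive")
    return 1.0 / cancer_vs_normal


def percent_change(fold: float) -> float:
    """Express a fold change as a signed percent change relative to baseline."""
    return (fold - 1.0) * 100.0


# A gene up 4-fold in cancer calls for roughly a four-fold decrease (0.25x).
four_fold_up = target_fold_change(4.0)
# A gene down 10-fold in cancer calls for roughly a 10-fold increase.
ten_fold_down = target_fold_change(0.1)
```

Under these assumptions a 4-fold increase corresponds to a target of 0.25 times the starting level, and a 10-fold decrease corresponds to a target of about 10 times the starting level.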

The amount of gene expression may be monitored using nucleic acid probes 
and the quantification of gene expression levels, or, alternatively, the gene product itself can
be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

WO 02/30268
PCT/US01/32045

In a preferred embodiment, the gene expression or protein levels of a number
of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates,
may be used with dispensed primers in desired wells. A PCR reaction can then be performed
and analyzed for each well.
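
As one possible illustration of dispensing primers into desired wells, the following sketch assigns primer pairs to the wells of a plate. The 96-well (8 x 12) layout, the function names and the primer pair labels are assumptions made for the example only.

```python
from itertools import product
import string


def plate_wells(rows: int = 8, cols: int = 12) -> list:
    """Return well names in row-major order, e.g. A1..A12, B1..B12, ..."""
    row_labels = string.ascii_uppercase[:rows]
    return [f"{r}{c}" for r, c in product(row_labels, range(1, cols + 1))]


def assign_primers(primer_pairs, wells=None) -> dict:
    """Map each primer pair (hypothetical labels) to one well of the plate."""
    wells = plate_wells() if wells is None else wells
    if len(primer_pairs) > len(wells):
        raise ValueError("more primer pairs than available wells")
    return dict(zip(wells, primer_pairs))


layout = assign_primers(["gene1_pair", "gene2_pair", "gene3_pair"])
```

Each well can then be amplified and analyzed separately, with the mapping used to attribute each result to its target sequence.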

Expression monitoring can be performed to identify compounds that modify 
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide 
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is 
added to the cells prior to analysis. Moreover, screens are also provided to identify agents 
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other 
binding partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses a prostate cancer phenotype, e.g. to a normal tissue
fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these 
concentrations serves as a negative control, i.e., at zero concentration or below the level of 
detection. 
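
A minimal sketch of how readings from such parallel assay mixtures might be compared against the zero-concentration negative control is given below. The function name, the units and the example readings are hypothetical; the sketch simply expresses each signal relative to the control.

```python
def normalize_to_control(readings: dict) -> dict:
    """Express each raw signal as a fraction of the negative control.

    readings maps agent concentration -> raw assay signal and must
    include the zero-concentration (negative control) mixture.
    """
    if 0.0 not in readings:
        raise ValueError("a zero-concentration negative control is required")
    control = readings[0.0]
    if control == 0:
        raise ValueError("control signal must be non-zero")
    # Sort by concentration so the differential response reads low to high.
    return {conc: signal / control for conc, signal in sorted(readings.items())}


# Hypothetical readings at 0, 1 and 10 concentration units.
response = normalize_to_control({10.0: 50.0, 0.0: 200.0, 1.0: 100.0})
```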

Drug candidates encompass numerous chemical classes, though typically they
are organic molecules, preferably small organic compounds having a molecular weight of
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than
2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents comprise
functional groups necessary for structural interaction with proteins, particularly hydrogen
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. The candidate agents often
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic
structures substituted with one or more of the above functional groups. Candidate agents are
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines,
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred
are peptides.
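
The molecular weight and functional group criteria above could, for example, be applied as a simple pre-filter on a list of candidates. The following is only a sketch: the Candidate record, its field names and the example entries are hypothetical, and the functional groups are assumed to have been annotated in advance rather than computed from a structure.

```python
from dataclasses import dataclass

# Functional groups named in the text as typical for candidate agents.
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float      # daltons
    groups: frozenset      # pre-annotated functional groups


def passes_filter(c: Candidate, min_groups: int = 1) -> bool:
    """True if the candidate is within the stated weight range (more than
    100 and less than about 2,500 daltons) and carries at least min_groups
    of the listed functional groups (use 2 for the preferred case)."""
    in_range = 100 < c.mol_weight < 2500
    n_groups = len(c.groups & FUNCTIONAL_GROUPS)
    return in_range and n_groups >= min_groups


candidates = [
    Candidate("cand_a", 180.2, frozenset({"carboxyl", "carbonyl"})),
    Candidate("cand_b", 78.1, frozenset()),
    Candidate("cand_c", 3000.0, frozenset({"amine", "hydroxyl"})),
    Candidate("cand_d", 300.0, frozenset({"amine"})),
]
preferred = [c.name for c in candidates if passes_filter(c, min_groups=2)]
```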

In one aspect, a modulator will neutralize the effect of a prostate cancer 
protein. By "neutralize" is meant that the activity of a protein is inhibited or blocked, along with the
consequent effect on the cell.

In certain embodiments, combinatorial libraries of potential modulators will be 
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical 
compounds generated by either chemical synthesis or biological synthesis by combining a 
number of chemical "building blocks" such as reagents. For example, a linear combinatorial 
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)).
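
The scale of such a linear combinatorial library can be illustrated with a short sketch. It assumes the 20 standard amino acids as the building blocks; the function names are hypothetical. Combining every building block at every position gives 20 raised to the power of the compound length, so a length of 5 already yields 3,200,000 sequences.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct linear compounds of the given length."""
    return len(alphabet) ** length


def enumerate_library(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


dipeptides = list(enumerate_library(2))
```

The generator form avoids holding a large library in memory; only short lengths are practical to enumerate in full.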

Preparation and screening of combinatorial chemical libraries is well known to 
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)),
oligocarbamates (Cho, et al., Science 261:1303 (1993)), and/or peptidyl phosphonates
(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem.
37:1385 (1994), nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see,
e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993);
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No.
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds,
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony,
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, 
Bedford, MA). 

A number of well known robotic systems have also been developed for 
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available 
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman 
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well 
as a high degree of flexibility and customization. The manufacturers of such systems provide 
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be 
designed to generate randomized proteins or nucleic acids, to allow the formation of all or 
most of the possible combinations over the length of the sequence, thus forming a library of 
randomized candidate bioactive proteinaceous agents. 

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids,
hydrophilic residues, sterically biased (either small or large) residues, towards the creation of
nucleic acid binding domains, the creation of cysteines for cross-linking, prolines for SH-3
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to
purines, etc.
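
The distinction between a fully randomized and a biased library can be sketched as follows. The per-position template, the particular residue classes and the function name are assumptions for the example: a position holding a single residue is held constant, a position holding a restricted class is randomized within that class, and a position holding all 20 residues is fully random.

```python
import random

ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"   # fully randomized position
HYDROPHOBIC = "AVLIMFW"                # one possible "defined class"


def biased_peptide(template, rng: random.Random) -> str:
    """Draw one peptide; each template entry lists the residues
    allowed at that position."""
    return "".join(rng.choice(allowed) for allowed in template)


# Position 1 fixed as Cys, position 2 hydrophobic, position 3 any residue,
# position 4 fixed as Pro (hypothetical design).
template = ["C", HYDROPHOBIC, ANY_RESIDUE, "P"]
rng = random.Random(0)                 # seeded only for reproducibility
library = [biased_peptide(template, rng) for _ in range(100)]
```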

Modulators of prostate cancer can also be nucleic acids, as defined below. As 
described above generally for proteins, nucleic acid modulating agents may be naturally
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for
proteins.

In certain embodiments, the activity of a prostate cancer-associated protein is 
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof. 
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability 
of the mRNA. 

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the prostate cancer
protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant 
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art. 

Antisense molecules as used herein include antisense or sense
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by
binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al.
(BioTechniques 6:958 (1988)).

In addition to antisense polynucleotides, ribozymes can be used to target and
inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes
have been described, including group I ribozymes, hammerhead ribozymes, hairpin
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in
Pharmacology 25:289-317 (1994) for a general review of the properties of different
ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.,
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent
No. 5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g.,
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et
al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 92:699-703
(1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al.,
Virology 205:121-126 (1994)).

Polynucleotide modulators of prostate cancer may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

As noted above, gene expression monitoring is conveniently used to test
candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent
has been added and the cells allowed to incubate for some period of time, the sample
containing a target sequence to be analyzed is added to the biochip. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with Cy3 or Cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, 
including high, moderate and low stringency conditions as outlined above. The assays are 
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is 
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g. albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific
or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be 
used as appropriate, depending on the sample preparation methods and purity of the target. 

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.

Screens are performed to identify modulators of the prostate cancer 
phenotype. In one embodiment, screening is performed to identify modulators that can 
induce or suppress a particular expression profile, thus preferably generating the associated 
phenotype. In another embodiment, e.g., for diagnostic applications, having identified
differentially expressed genes important in a particular state, screens can be performed to
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the
expression product of a differentially expressed gene. Again, having identified the
importance of a gene in a particular state, screens are performed to identify agents that bind
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress a prostate
cancer expression pattern leading to a normal expression pattern, or to modulate a single
prostate cancer gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These agent-specific
sequences can be identified and used by methods described herein for prostate cancer
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated prostate cancer
tissue sample.
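
The comparison described above, identifying sequences expressed in agent treated tissue but not in normal or untreated prostate cancer tissue, can be expressed as a set difference. The function name and the gene labels in this sketch are hypothetical, and each input is assumed to be the set of genes already called as "expressed" in that tissue.

```python
def agent_specific_genes(normal, cancer, treated) -> set:
    """Genes expressed in agent treated tissue but in neither
    normal tissue nor untreated prostate cancer tissue."""
    return set(treated) - set(normal) - set(cancer)


# Hypothetical sets of expressed genes for each tissue.
normal_expressed = {"geneA", "geneB"}
cancer_expressed = {"geneB", "geneC"}
treated_expressed = {"geneA", "geneC", "geneX", "geneY"}

markers = agent_specific_genes(normal_expressed, cancer_expressed, treated_expressed)
```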

Thus, in one embodiment, a test compound is administered to a population of
prostate cancer cells that have an associated prostate cancer expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems
can also be used.

Once the test compound has been administered to the cells, the cells can be 
washed if desired and are allowed to incubate under preferably physiological conditions for 
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, e.g., prostate cancer tissue may be screened for agents that modulate, 
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on prostate 
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for
new drugs that alter the phenotype can be devised. With this approach, the drug target need
not be known and need not be represented in the original expression screening platform, nor 
does the level of transcript for the target protein need to change. 
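
One way to detect a change in the expression profile after treatment is to compare expression levels gene by gene. The sketch below is illustrative only: the 2-fold threshold, the function name and the example values are assumptions and are not taken from the specification, which requires only a change in at least one gene of the profile.

```python
import math


def changed_genes(before: dict, after: dict, min_fold: float = 2.0) -> dict:
    """Return {gene: log2 fold change} for genes whose expression changed
    by at least min_fold in either direction after treatment."""
    threshold = math.log2(min_fold)
    changed = {}
    for gene in before.keys() & after.keys():
        if before[gene] <= 0 or after[gene] <= 0:
            continue  # skip genes without a usable signal in both profiles
        log_fc = math.log2(after[gene] / before[gene])
        if abs(log_fc) >= threshold:
            changed[gene] = log_fc
    return changed


# Hypothetical expression levels before and after adding a test compound.
profile_before = {"g1": 100.0, "g2": 50.0, "g3": 10.0}
profile_after = {"g1": 25.0, "g2": 55.0, "g3": 40.0}
hits = changed_genes(profile_before, profile_after)
agent_has_effect = len(hits) >= 1
```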

In a preferred embodiment, as outlined above, screens may be done on 
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein corresponding to the fragment encoded by the nucleic
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a
preferred embodiment, the prostate cancer amino acid sequence which is used to determine
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a 
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as 
further described herein. 

Preferably, the prostate cancer modulatory protein is a fragment
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in
coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA.

Measurements of prostate cancer polypeptide activity, or of prostate cancer or
the prostate cancer phenotype can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian
prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the prostate cancer polypeptide levels are determined in vitro by measuring the level of
protein or mRNA. The level of protein is measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR or LCR, or hybridization assays, e.g., northern hybridization, RNase protection, or dot
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.
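
As a rough illustration of how readouts from such a reporter assay might be summarized, the following Python sketch computes the fold change in reporter signal between modulator-treated and untreated cells. The function name, the triplicate layout, and the luminescence values are hypothetical and are not taken from this disclosure.

```python
from statistics import mean


def reporter_fold_change(treated, untreated):
    """Ratio of mean reporter signal in treated wells to untreated wells.

    Values below 1.0 suggest reduced promoter activity; values above 1.0
    suggest increased promoter activity.
    """
    baseline = mean(untreated)
    if baseline == 0:
        raise ValueError("untreated signal must be non-zero")
    return mean(treated) / baseline


# Hypothetical luminescence readings (arbitrary units), in triplicate.
untreated_wells = [1200.0, 1150.0, 1250.0]
treated_wells = [300.0, 290.0, 310.0]

fold_change = reporter_fold_change(treated_wells, untreated_wells)
print(f"fold change: {fold_change:.2f}")
```

In practice the raw signal would usually be normalized first, e.g., to cell number or to a co-transfected control reporter; that step is omitted here for brevity.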

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins."
The prostate cancer protein may be a fragment, or alternatively, may be the full length protein
corresponding to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate
cancer protein and a candidate compound, and determining the binding of the compound to
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g., for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of any
convenient shape. Examples of suitable insoluble supports include microtiter plates, arrays,
membranes and beads. These are typically made of glass, plastic (e.g., polystyrene),
polysaccharides, nylon or nitrocellulose, Teflon™, etc. Microtiter plates and arrays are
especially convenient because a large number of assays can be carried out simultaneously,
using small amounts of reagents and samples. The particular manner of binding of the
composition is not crucial so long as it is compatible with the reagents and overall methods of
the invention, maintains the activity of the composition and is nondiffusable. Preferred
methods of binding include the use of antibodies (which do not sterically block either the
ligand binding site or activation sequence when the protein is bound to the support), direct
binding to "sticky" or ionic supports, chemical crosslinking, the synthesis of the protein or
agent on the surface, etc. Following binding of the protein or agent, excess unbound material
is removed by washing. The sample receiving areas may then be blocked through incubation
with bovine serum albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner,
ligand, etc. Under certain circumstances, there may be competitive binding between the
compound and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the prostate cancer protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the prostate cancer protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the prostate cancer proteins. In
this embodiment, the methods comprise combining a prostate cancer protein and a competitor
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.
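
The two-sample comparison described above can be sketched in Python as follows. This is only an illustrative calculation under stated assumptions: the signal values are invented, the helper names are hypothetical, and Welch's t statistic is used here as one possible way to judge whether the difference between triplicate samples is larger than the replicate scatter.

```python
from math import sqrt
from statistics import mean, variance


def fractional_displacement(first_sample, second_sample):
    """Fractional loss of competitor binding in the second sample
    (protein + competitor + test compound) relative to the first
    sample (protein + competitor only)."""
    baseline = mean(first_sample)
    return (baseline - mean(second_sample)) / baseline


def welch_t(a, b):
    """Welch's t statistic for two independent sets of replicates."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical bound-competitor signal (e.g., counts per minute), in triplicate.
first_sample = [1000.0, 980.0, 1020.0]   # no test compound
second_sample = [400.0, 420.0, 380.0]    # with test compound

displacement = fractional_displacement(first_sample, second_sample)
t_statistic = welch_t(first_sample, second_sample)
print(f"competitor binding reduced by {displacement:.0%}, t = {t_statistic:.1f}")
```

A large displacement together with a large t statistic would suggest that the test compound competes for binding; the threshold for calling a hit is a design choice left to the screener.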

Alternatively, differential screening is used to identify drug candidates that
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer
proteins. The structure of the prostate cancer protein may be modeled, and used in rational
drug design to synthesize agents that interact with that site. Drug candidates that affect the
activity of a prostate cancer protein are also identified by screening drugs for the ability to
either enhance or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are performed in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled, agent is determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous
or subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (i.e., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is
provided. The method comprises administration of a prostate cancer inhibitor. In another
embodiment, a method of inhibiting prostate cancer is provided. The method comprises
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating
cells or individuals with prostate cancer are provided. The methods comprise administration
of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate a normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney, Culture of Animal Cells: A Manual of Basic Technique (3rd ed., 1994),
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until
they touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a
preferred method of measuring density limitation of growth. Transformed host cells are
transfected with a prostate cancer-associated sequence and are grown for 24 hours at
saturation density in non-limiting medium conditions. The percentage of cells labeling with
(³H)-thymidine is determined autoradiographically. See, Freshney (1994), supra.

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp.
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control cells.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth,
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than in their normal
counterparts (see, e.g., Folkman, Angiogenesis and Cancer, Sem. Cancer Biol. (1992)).

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974);
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer 42:305-312
(1980); Gullino, Angiogenesis, tumor vascularization, and potential interference with
tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985);
Freshney, Anticancer Res. 5:111-130 (1985).

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the
level of invasion of host cells can be measured by using filters coated with Matrigel or some
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of
the filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual,
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987).

Alternatively, various immune-suppressed or immune-deficient host animals
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J.
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10⁶ cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have statistically
significant reduction (using, e.g., Student's t-test) are said to have inhibited growth.
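
For illustration only, the comparison of treated and control tumors might be computed as in the Python sketch below. The ellipsoid volume approximation (half of length times width squared) is a commonly used convention and is an assumption here, not something specified in this disclosure; the tumor dimensions and function names are likewise hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Approximate tumor volume (mm^3) from its two largest dimensions,
    using the ellipsoid convention V = 0.5 * length * width^2 (assumed)."""
    return 0.5 * length_mm * width_mm ** 2


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and degrees of freedom
    for two independent groups."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical (length, width) measurements in mm after the growth period.
control_dims = [(12, 10), (13, 10), (11, 10), (12, 11)]
treated_dims = [(8, 6), (9, 6), (7, 6), (8, 7)]

control_volumes = [tumor_volume(l, w) for l, w in control_dims]
treated_volumes = [tumor_volume(l, w) for l, w in treated_dims]

t_statistic, dof = student_t(control_volumes, treated_volumes)
print(f"t = {t_statistic:.2f} with {dof} degrees of freedom")
```

The resulting t statistic would then be compared against a critical value for the stated degrees of freedom (from a t table or a statistics package) to decide whether the reduction is statistically significant.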

Methods of identifying variant prostate cancer-associated sequences

Without being bound by theory, expression of various prostate cancer
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or
variant prostate cancer genes may be determined. In one embodiment, the invention provides
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the prostate cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one prostate cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer
gene, i.e., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared
to the sequence of a known prostate cancer gene to determine if any differences exist. This
can be done using any number of known homology programs, such as Bestfit, etc. In a
preferred embodiment, the presence of a difference in the sequence between the prostate
cancer gene of the patient and the known prostate cancer gene correlates with a disease state
or a propensity for a disease state, as outlined herein.
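
As a minimal sketch of the comparison step, the Python snippet below lists the positions at which a patient sequence differs from a reference sequence. It assumes the two sequences have already been aligned to equal length; it does not reproduce Bestfit or any other homology program, and the sequences and function name are invented for illustration.

```python
def find_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length; insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical wild-type and patient sequences (not from Tables 1-16).
wild_type = "ATGGCCATTGTAATG"
patient = "ATGGCCGTTGTAATG"

differences = find_differences(wild_type, patient)
for position, ref_base, sample_base in differences:
    print(f"position {position}: {ref_base} -> {sample_base}")
```

A real analysis would rely on an alignment tool to handle gaps before any position-by-position comparison of this kind.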

In a preferred embodiment, the prostate cancer genes are used as probes to
determine the number of copies of the prostate cancer gene in the genome.

In another preferred embodiment, the prostate cancer genes are used as probes
to determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when
chromosomal abnormalities such as translocations, and the like are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of a prostate cancer
protein or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces effects for which it is administered. The exact dose will
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus
localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans 
and other animals, particularly mammals. Thus the methods are applicable to both human 
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the prostate cancer proteins and modulators thereof of
the present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a prostate
cancer protein in a form suitable for administration to a patient. In the preferred embodiment,
the pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Science (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman & Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present prostate cancer protein-modulating
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel et al., eds., Current Protocols (supplemented through 1999),
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin. Exp. Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle absorbed cDNA (Ulmer et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol Med Today 6:66-71 (2000); Shedlock et al., J Leukoc Biol 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., wherein antisense RNA is directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer
protein. Depending on the desired expression level, promoters of various strengths can be
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are
additionally useful in screening for modulators to treat prostate cancer.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
prostate cancer-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing prostate cancer-associated activity. Optionally, the kit contains
biologically active prostate cancer protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.

EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints 

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month, or the purification may be continued.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temperature for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.
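
The protocol states some spins in rpm and others in x g. As a rough aid, the two can be related through the standard relative centrifugal force formula, RCF = 1.118 x 10^-5 x r x rpm^2, with the rotor radius r in cm. The Python sketch below is illustrative only: the rotor radius is a hypothetical value, since the text does not identify the Sorvall rotor, and the function names are assumptions introduced for this example.

```python
# Convert between rotor speed (rpm) and relative centrifugal force (x g).
# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2.

def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a speed in rpm and a radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed in rpm needed to reach a given RCF at a radius in cm."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


if __name__ == "__main__":
    ASSUMED_RADIUS_CM = 10.0  # hypothetical; take the real value from the rotor manual
    print(f"6500 rpm is about {rcf_from_rpm(6500, ASSUMED_RADIUS_CM):.0f} x g")
    print(f"7500 x g needs about {rpm_from_rcf(7500, ASSUMED_RADIUS_CM):.0f} rpm")
```

The actual force depends on the rotor in use, so the radius should be replaced with the manufacturer's figure before relying on either conversion.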

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temperature for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for
5 minutes at 4°C.
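
The reagent volumes in the homogenization, phase separation, precipitation and wash steps all scale with the TRIzol volume, which in turn scales with tissue mass. A minimal Python sketch of that scaling follows; the function name and dictionary keys are illustrative assumptions, and only the ratios are taken from the text.

```python
# Scale TRIzol-protocol reagent volumes to the estimated tissue mass.
# Ratios from the text: 1 ml TRIzol per 50 mg tissue; per 1 ml TRIzol:
# 0.2 ml chloroform, 0.5 ml isopropyl alcohol, 1 ml 75% ethanol.

def trizol_volumes(tissue_mg: float) -> dict:
    """Return reagent volumes in ml for a given tissue mass in mg."""
    trizol_ml = tissue_mg / 50.0
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,
        "isopropanol_ml": 0.5 * trizol_ml,
        "ethanol_75_ml": 1.0 * trizol_ml,
    }


if __name__ == "__main__":
    for reagent, volume in trizol_volumes(250.0).items():
        print(f"{reagent}: {volume:.2f}")
```

For a 250 mg sample this gives a working volume between 2 ml and 10 ml, the range for which the text suggests a 15 ml polypropylene tube.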

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding down the tube into the new tube by use of a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in an
appropriate volume of DEPC H2O (e.g., to 2-5 ug/ul). The absorbance is then measured.

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, warm up the 2 x
Binding Buffer at 65°C. The total RNA is mixed with DEPC-treated water, 2 x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low.

The absorbance is next read to determine the yield, using diluted Elution
Buffer as the blank.
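
The text does not state how yield is computed from the absorbance reading. A common convention, assumed here and not taken from the source, is that an A260 of 1.0 in a 1 cm path corresponds to roughly 40 ug/ml of single-stranded RNA. The following Python sketch applies that convention; the function name and the example numbers are hypothetical.

```python
# Estimate RNA yield from a blank-corrected A260 reading.
# Assumption (not from the source): 1 A260 unit ~ 40 ug/ml RNA, 1 cm path length.

RNA_UG_PER_ML_PER_A260 = 40.0


def rna_yield_ug(a260: float, dilution_factor: float, sample_volume_ul: float) -> float:
    """Total RNA (ug) in the undiluted sample."""
    conc_ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    return conc_ug_per_ml * sample_volume_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.25 on a 1:50 dilution of a 100 ul eluate.
    print(f"{rna_yield_ug(0.25, 50.0, 100.0):.1f} ug")
```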

Before proceeding with cDNA synthesis, the mRNA is precipitated, as components
left over from the Oligotex purification procedure or present in the Elution Buffer
will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol is then dried from the pellet
without use of a speed vacuum and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.
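
The precipitation step is specified in sample volumes. The Python sketch below turns the 0.4 vol. and 2.5 vol. figures into microliters. It assumes both ratios refer to the starting sample volume, which the text does not state explicitly, and the function name is illustrative.

```python
# Ammonium acetate / ethanol precipitation volumes.
# Assumption: both ratios are relative to the starting sample volume.

def precipitation_volumes(sample_ul: float) -> dict:
    """Volumes (ul) of 7.5 M NH4OAc and cold 100% ethanol to add."""
    nh4oac_ul = 0.4 * sample_ul
    ethanol_ul = 2.5 * sample_ul
    return {
        "nh4oac_7_5M_ul": nh4oac_ul,
        "ethanol_100pct_ul": ethanol_ul,
        "total_ul": sample_ul + nh4oac_ul + ethanol_ul,
    }


if __name__ == "__main__":
    for name, volume in precipitation_volumes(100.0).items():
        print(f"{name}: {volume:.1f}")
```

The total is useful for choosing a tube that will hold the precipitation without overfilling.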

Alternatively, the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, the column is transferred to a new 2 ml collection tube, 500 ul Buffer
RPE is added, and the column is centrifuged for 15 sec at >10,000 rpm. The flowthrough is
discarded. A further 500 ul of Buffer RPE is then added and the preparation is centrifuged
for 15 sec at >10,000 rpm. The
flowthrough is discarded, and the column membrane dried by centrifuging for 2 min at
maximum speed. The column is transferred to a new 1.5-ml collection tube. 30-50 ul of
RNase-free water is applied directly onto the column membrane. The column is then centrifuged
for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.


Second Strand Synthesis 

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
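
As a consistency check on the second strand recipe, the component volumes can be summed and the working strength of the 5X buffer computed. This Python sketch uses only the volumes given above together with the 20 ul first strand reaction; the variable and function names are illustrative.

```python
# Second strand synthesis: total reaction volume and buffer working strength.

FIRST_STRAND_UL = 20.0  # final volume of the first strand reaction (from the text)

SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10 U/ul)": 1.0,
    "E. coli DNA Polymerase (10 U/ul)": 4.0,
    "RNase H (2 U/ul)": 1.0,
}


def second_strand_summary() -> tuple:
    """Return (total volume in ul, working strength of the 2nd Strand Buffer in X)."""
    total_ul = FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())
    buffer_x = 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / total_ul
    return total_ul, buffer_x


if __name__ == "__main__":
    total, strength = second_strand_summary()
    print(f"total volume: {total:.0f} ul, buffer strength: {strength:.1f}X")
```

The listed volumes bring the 5X buffer to 1X in the combined reaction, before the T4 DNA Polymerase and EDTA are added.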

Cleaning up cDNA 

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min, maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and resuspended in a volume
compatible with the fragmentation step.
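
The IVT labeling mix can be checked arithmetically: the listed volumes should sum to the stated 20 ul, and the ratio of biotinylated to total UTP (and CTP) follows from the stock concentrations. The Python sketch below performs both checks using only figures from the text; the names are illustrative.

```python
# IVT labeling mix: total volume and fraction of biotin-labeled UTP/CTP.

IVT_MIX_UL = {
    "cDNA": 1.5,
    "T7 10x ATP (75 mM)": 2.0,
    "T7 10x GTP (75 mM)": 2.0,
    "T7 10x CTP (75 mM)": 1.5,
    "T7 10x UTP (75 mM)": 1.5,
    "Bio-11-UTP (10 mM)": 3.75,
    "Bio-16-CTP (10 mM)": 3.75,
    "10x T7 transcription buffer": 2.0,
    "10x T7 enzyme mix": 2.0,
}


def labeled_fraction(unlabeled_ul: float, unlabeled_mm: float,
                     labeled_ul: float, labeled_mm: float) -> float:
    """Mole fraction of the biotinylated nucleotide in its nucleotide pool."""
    labeled = labeled_ul * labeled_mm        # nmol
    unlabeled = unlabeled_ul * unlabeled_mm  # nmol
    return labeled / (labeled + unlabeled)


if __name__ == "__main__":
    print(f"total volume: {sum(IVT_MIX_UL.values()):.2f} ul")
    print(f"biotin-UTP fraction: {labeled_fraction(1.5, 75.0, 3.75, 10.0):.2f}")
```

With these volumes, one quarter of the UTP pool and one quarter of the CTP pool is biotinylated.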

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.
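
The 1 x working concentrations and the amount of 5 x stock needed for a given reaction volume follow directly from the buffer composition. The Python sketch below computes both; the function names are illustrative and the 20 ul ceiling is taken from the text.

```python
# Fragmentation buffer: 1x working concentrations and 5x stock volume.

FRAG_BUFFER_5X_MM = {"Tris-acetate, pH 8.1": 200.0, "KOAc": 500.0, "MgOAc": 150.0}
MAX_REACTION_UL = 20.0  # the text advises against exceeding 20 ul


def working_concentrations_mm() -> dict:
    """Component concentrations (mM) at 1x."""
    return {name: conc / 5.0 for name, conc in FRAG_BUFFER_5X_MM.items()}


def stock_volume_ul(reaction_ul: float) -> float:
    """Volume of 5x buffer (ul) for a reaction of the given final volume."""
    if reaction_ul > MAX_REACTION_UL:
        raise ValueError("reaction volume above the 20 ul limit given in the text")
    return reaction_ul / 5.0


if __name__ == "__main__":
    print(working_concentrations_mm())
    print(f"5x buffer for a 10 ul reaction: {stock_volume_ul(10.0):.1f} ul")
```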

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; brought to 300 ul with 1x MES hyb buffer.

The hybridization reaction is conducted with non-biotinylated IVT (purified
by RNeasy columns) (see Example 1 for steps from tissue to IVT). The following mixture is
prepared:

IVT antisense RNA, 4 ug:       ul
Random hexamers (1 ug/ul):     4 ul
H2O:                           ul
Total:                         14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:

0.1 M DTT:                     3 ul
50X dNTP mix:                  0.6 ul
H2O:                           2.4 ul
Cy3 or Cy5 dUTP (1 mM):        3 ul
SS RT II (BRL):                1 ul
Total:                         16 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SSII is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM of cold dATP, dCTP, and dGTP, 10 mM
of dTTP and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP; 10 ul of
100 mM dTTP to 15 ul H2O.
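
The stated concentrations of the 50X dNTP mix can be verified from the recipe by simple dilution arithmetic. The Python sketch below does so using only the volumes and stock concentrations given in the text; the names are illustrative.

```python
# Verify the 50X dNTP mix: final concentration = stock conc * stock vol / total vol.

STOCK_MM = 100.0
RECIPE_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0, "H2O": 15.0}


def dntp_mix_concentrations_mm() -> dict:
    """Final concentration (mM) of each nucleotide in the mix."""
    total_ul = sum(RECIPE_UL.values())
    return {
        name: STOCK_MM * volume / total_ul
        for name, volume in RECIPE_UL.items()
        if name != "H2O"
    }


if __name__ == "__main__":
    print(f"total volume: {sum(RECIPE_UL.values()):.0f} ul")
    for name, conc in dntp_mix_concentrations_mm().items():
        print(f"{name}: {conc:.0f} mM")
```

The reduced dTTP concentration leaves room for the Cy3 or Cy5 dUTP added in the reverse transcription mixture.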

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C for 10 min. For U-Con 30, 500 ul TE/sample, spin at 7000 g
for 10 min, save flow through for purification. For Qiagen purification, suspend U-Con
recovered material in 500 ul buffer PB and proceed using Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dilution of DNAse/30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.


Sample preparation 

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat at
95°C for 2 min and slow cool at room temperature for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMTs
and channels.
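
The wash solution recipes can be cross-checked against their nominal strengths. The Python sketch below assumes that "in 250 ml H2O" means a final volume of 250 ml, which is an interpretation rather than something the text states; under that assumption the stock volumes reproduce the nominal strengths. The names are illustrative.

```python
# Check post-hybridization wash recipes against their nominal strengths.
# Assumption: each recipe is made up to a final volume of 250 ml.

FINAL_ML = 250.0
SSC_STOCK_X = 20.0
SDS_STOCK_PCT = 10.0

WASHES = [
    # (label, ml of 20X SSC, ml of 10% SDS, wash time in minutes)
    ("3X SSC/0.03% SDS", 37.5, 0.75, 2),
    ("1X SSC", 12.5, 0.0, 5),
    ("0.2X SSC", 2.5, 0.0, 5),
]


def wash_strength(ssc_ml: float, sds_ml: float) -> tuple:
    """Return (SSC strength in X, SDS in percent) at the final volume."""
    return (SSC_STOCK_X * ssc_ml / FINAL_ML, SDS_STOCK_PCT * sds_ml / FINAL_ML)


if __name__ == "__main__":
    for label, ssc_ml, sds_ml, minutes in WASHES:
        ssc_x, sds_pct = wash_strength(ssc_ml, sds_ml)
        print(f"{label}: {ssc_x:g}X SSC, {sds_pct:g}% SDS, {minutes} min")
```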

Example 2: Taxol resistant Xenograft Model of Human Prostate Cancer 

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become,
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks, before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the
same sub-therapeutic dose of taxol. This process was repeated 14 times, and a portion of
each generation of xenograft tumor was collected. There was increasing resistance to
therapeutic doses of taxol with each generation. By the end of the process, the tumors were
fully resistant to therapeutic doses of taxol. RNA from each generation of tumor was then
isolated, and individual mRNA species were quantified using a custom Affymetrix
GeneChip® oligonucleotide microarray, with probes to interrogate approximately 35,000
unique mRNA transcripts. Genes were selected that showed a statistically significant
up-regulation, or down-regulation, during the subsequent generations of increasingly
taxol-resistant tumors. Only one gene was significantly up-regulated, whereas 24 genes were
down-regulated; these are presented in Table 10.
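
The text does not specify which statistical test was used to select genes whose expression changed across tumor generations. Purely as an illustration of the idea, the Python sketch below scores a monotonic trend with a Spearman rank correlation between generation number and expression level. The choice of statistic, the 0.8 cutoff, the function names and the example values are all assumptions made for this sketch and are not the method or data of the experiment; a cutoff on the correlation is also not itself a significance test.

```python
# Illustrative trend scoring across serial xenograft generations.
# Assumed method: Spearman rank correlation of expression vs. generation index.

def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_trend(expression_by_generation):
    """Spearman rho between generation order and expression; 0.0 if undefined."""
    n = len(expression_by_generation)
    x = average_ranks(list(range(n)))
    y = average_ranks(list(expression_by_generation))
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def classify_trend(expression_by_generation, cutoff=0.8):
    """Label a series 'up', 'down' or 'none' using a hypothetical rho cutoff."""
    rho = spearman_trend(expression_by_generation)
    if rho >= cutoff:
        return "up"
    if rho <= -cutoff:
        return "down"
    return "none"


if __name__ == "__main__":
    # Hypothetical expression values for one probe over five generations.
    print(classify_trend([120.0, 95.0, 80.0, 61.0, 40.0]))
```

In practice a significance threshold would be set with an appropriate correction for testing roughly 35,000 transcripts at once.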
The gene sequences identified to be overexpressed in prostate cancer may be
used to identify coding regions from the public DNA database. The sequences may be used
to either identify genes that encode known proteins, or they may be used to predict the coding
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art
would understand how to obtain the unigene cluster identification and sequence information
according to the exemplar accession numbers provided in Tables 1-16 (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

TABLE 1: Shows genes, including expressed sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.






Pkey:           Unique Eos probeset identifier number
ExAccn:         Exemplar Accession number, Genbank accession number
UnigeneID:      Unigene number
Unigene Title:  Unigene gene title
R1:             Ratio of tumor to normal body tissue



Pkey 


UnigenelD ExAccn 


Uningene Title 


R1 


131919 


Hs.272458 AA121266 


ESTs 


37.2 


120328 


HS290905 AA196979 


ESTs; Weakly similar to (deffine not ava 


32.6 


105201 


Hs31412 AA195626 


ESTs 


30.1 


101486 


Hs.1852 M24902 


acid phosphatase; prostate 


25.2 


119073 


HS279477 R32894 


ESTs 


24.8 


133428 


Hs.183752 M34376 


microseminoprotein; beta- 


23.8 


128180 


Hs.171995 AA595348 


kallikrein 3; (prostate specific antigen 


21.4 


104080 


Hs.57771 AA402971 


Homo sapiens mRNA for serine protease (T 


18.9 


127537 


Hs.162859 AA569531 


ESTs 


18.6 


131665 


HJL30343 R22139 


ESTs 


17.4 


101050 


Hs.1832 K01911 


neuropeptide Y 


17.3 


130771 


Hs.1915 N48056 


folate hydrolase (prostate-specific memb 


17 


108153 


Hs.40808 AA054237 


ESTs 


16.9 


107485 


Hs.262476 W63793 


S-adenosylmethionine decarboxylase 1 


16.7 


106155 


Hs.33287 AA425309 


ESTs 


165 


129534 


Hs.11260 R73640 


ESTs 


16.4 


100569 


Hs.171995 HG2261-HT2351 


Antigen, Prostate Specific, Alt Splice 


16 


101889 


Hs.181350 S39329 


kallikrein 2; prostatic 


15.4 


135389 


H&99872 U05237 


fetal Alzheimer antigen 


15 


101506 


Hs.62192 M27436 


coagulation factor 111 (thromboplastin; 


4 O ft 

13.9 


134374 


Ks.8236 D62633 


ESTs 


127 


133944 


Hs.7780 AA045870 


ESTs 


125 


109141 


Hs.193380 M176428 


ESTs 


123 


130974 


Hs.2178 X57985 


H2B histone family; member Q 


11.8 


114768 


Hs.182339 M149007 


ESTs 


11.8 


104394 


Hs.172129 H46617 


yp19h1.fi Scares breast 3NbHBst Homo sap 


11.8 


125299 Hs.102720 Z39436 ESTs 11.6

104660 Hs.14846 AA007160 ESTs 11.4

100116 Hs.78045 D00654 actin; gamma 2; smooth muscle; enteric 11

131061 Hs.268744 N64328 ESTs; Moderately similar to KIAA0273(H. 10.9

126645 126645 AI167942 Homo sapiens BAC clone RG041D11 from 7q2 10.7

135153 Hs.95420 N40141 Homo sapiens mRNA for JM27 protein; comp 10.6

107033 Hs.113314 AA599629 ESTs 10.6

118417 N66048 ESTs; Weakly similar to polymerase [H.sa 10.5

126758 Hs.293960 W37145 ESTs 10.2

115674 Hs.8364 AA406542 ESTs 10.1

134989 Hs.92381 AA236324 ESTs; Weakly similar to !!!! ALU CLASS A 10.1

107102 Hs.30652 AA609723 ESTs 10.1


116787 Hs.15641 H28581 ESTs 10.1

115719 Hs.59622 AA416997 ESTs 10

123209 Hs.203270 AA489711 ESTs 9.9

101664 Hs.121017 M60752 H2A histone family; member A 9.8

112971 Hs.83883 T17185 ESTs 9.7

102519 Hs.80296 U52969 Purkinje cell protein 4 9.7

117984 Hs.106778 N51919 ESTs 9.7

105840 Hs.22209 AA398533 ESTs 9.4

129523 Hs.274509 M30894 T-cell receptor; gamma cluster 9.4

132964 Hs.167133 AA031360 ESTs 9.2

121853 Hs.98502 AA425887 ESTs 9
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115764 Hs.91011 AA421562 anterior gradient 2 (Xenopus laevis; sec 8.9 

119617 Hs.55999 W47380 ESTs 8.9

100552 Hs.301946 HG2167-HT2237 Protein Kinase Ht31, Camp-Dependent 8.9 

105627 Hs.23317 AA281245 ESTs 8.8 

101461 Hs.76422 M22430 phospholipase A2; group IIA (platelets; 8.7

131725 Hs.31146 AA456264 ESTs; Highly similar to (defline not ava 8.5 

124526 Hs.293185 N62096 yz61c5^1 Soares_multiple_sclerosis_2NbH 8.5

118528 Hs.49397 N67889 ESTs 8.2 

133845 Hs.76704 T68510 ESTs 8.2 

133354 Hs.334762 AA055552 ESTs; Weakly similar to KIAA0319 [H.sapi 8.1

105912 Hs.20415 AA402000 ESTs; Weakly similar to GS3786 [H.sapien 8

119018 Hs.278695 N95796 ESTs 8

100394 Hs.66052 D84276 CD38 antigen (p45) 8 

114132 Hs.24192 Z38688 ESTs 7.9 

116786 Hs.301527 H25836 tumor necrosis factor (ligand) superfami 7.7

106579 Hs.23023 AA456135 ESTs 7.6

128790 Hs.105700 AA291725 secreted frizzled-related protein 4 7.5 

114965 Hs.72472 AA250737 ESTs 7.4 

112033 Hs.22627 R43162 ESTs 7.1 

102398 U42359 Human N33 protein form 1 (N33) gene, exo 7

101201 Hs.2256 L22524 matrix metalloproteinase 7 (matrilysin; 6.9

109272 Hs.288462 AA195718 ESTs 6.9 

103145 Hs.169849 X66276 myosin-binding protein C; slow-type 6.9 

101803 Hs.155691 M86546 pre-B-cell leukemia transcription factor 6.8 

120562 Hs.302267 AA280036 ESTs; Weakly similar to W01A6.C [C.elega 6.8

109112 Hs.257924 AA169379 ESTs 6.8

109795 Hs.326416 F10707 ESTs 6.7

107532 Hs.173684 Z19643 ESTs; Weakly similar to (defline not ava 6.7 

130336 Hs.171995 X07730 kallikrein 3; (prostate specific antigen 6.6 

131425 Hs.26691 AA219134 ESTs 6.6

120588 Hs.16193 AA281591 Homo sapiens mRNA; cDNA DKFZp586B211 (fr 6.6

132902 Hs.59838 AA490969 ESTs 6.6 

125674 Hs.323378 W28078 H.sapiens mRNA for transmembrane protein 6.6 

133724 Hs.75746 U07919 aldehyde dehydrogenase 6 6.5

130343 Hs.278628 AA490262 ESTs; Moderately similar to APXL gene pr 6.5

120215 Hs.108787 Z41050 Homo sapiens Mcd4p homolog mRNA; complet 6.5

129215 Hs.126085 AA176867 ESTs 6.5 

131881 Hs.3383 AA010163 upstream regulatory element binding prot 6.5 

133376 Hs.7232 T23670 ESTs 6.4 

105376 Hs.8768 AA236559 ESTs; Weakly similar to neuronal thread 6.4

104674 Hs.26289 AA009527 ESTs 6.4

100727 Hs.334786 X07290 Human HF.12 gene mRNA 6.3 

130150 Hs.15113 AF000573 homogentisate 1,2-dioxygenase (homogenti 6.3

121770 Hs.278428 AA421714 Homo sapiens mRNA for KIAA0896 protein; 6.3 

123475 Hs.250528 AA599267 ESTs; Weakly similar to ANKYRIN; BRAIN V 6.3

133061 Hs.296638 AB000584 prostate differentiation factor 6.3

116429 Hs.279923 AA609710 ESTs; Weakly similar to similar to GTP-b 6.2

101233 Hs.878 L29008 sorbitol dehydrogenase 6.2

104691 Hs.37744 AA011176 ESTs 6.2

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2

127775 Hs.179902 H04106 ESTs; Weakly similar to (defline not ava 6.2

105500 Hs.222399 AA256485 ESTs 6.1

131463 Hs.2714 X74142 forkhead (Drosophila)-like 1 6.1

132116 Hs.40289 AA234767 ESTs 6 

130828 Hs.203213 AA053400 ESTs 5.9

115357 Hs.72988 AA281793 ESTs 5.8 

105496 Hs.301997 AA256323 ESTs 5.7 

116334 Hs.48948 AA491457 ESTs 5.7 

107968 Hs.61539 AA034020 ESTs 5.7

120132 Hs.125019 Z38839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6

106375 Hs.289072 AA443993 ESTs 5.6 

132550 Hs.170195 AA029597 bone morphogenetic protein 7 (osteogenic 5.6 

124777 Hs.140237 R41933 ESTs; Weakly similar to neuronal thread 5.6

100311 Hs.337616 D50640 phosphodiesterase 3B; cGMP-inhibited 5.6

101791 Hs.62354 M83822 Human beige-like protein (BGL) mRNA; par 5.5

117698 Hs.45107 N41002 ESTs 5.5

132387 Hs.281434 R70914 heat shock 70kD protein 1 5.5

122041 Hs.98732 AA431407 Homo sapiens Chromosome 16 BAC clone CIT 5.5

133723 Hs.262476 AA088851 S-adenosylmethionine decarboxylase 1 5.5
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113938 W81598 ESTs 5.4 

133015 Hs.246315 AA047036 ESTs 5.4 

125745 Hs.75722 AI283493 ribophorin II 5.4

107295 Hs.80120 T34527 UDP-N-acetyl-alpha-D-galactosamine:polyp 5.4

108186 Hs.7780 AA056482 ESTs 5.3

100184 Hs.21223 D17408 calponin 1; basic; smooth muscle 5.3

104466 Hs.326392 N25110 Human guanine nucleotide exchange factor 5.3 

104033 Hs.98944 AA365031 ESTs 5.3

110844 Hs.167531 N31952 ESTs; Weakly similar to (defline not ava 5.3

129056 Hs.108336 H70627 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3

102805 Hs.25351 U90304 iroquois-class homeodomain protein 5.3

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel 5.3 

129184 Hs.109201 W26769 ESTs; Highly similar to (defline not ava 5.2

134158 Hs.79428 U15174 BCL2/adenovirus E1B 19kD-interacting pro 5.2

107240 Hs.159872 D59368 ESTs 5.2

104787 AA027317 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.2

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 5.2

116646 Hs.194228 F03048 ESTs; Moderately similar to !!!! ALU SUB 5.2

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1

116188 Hs.184598 AA464728 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1

126259 Hs.281428 Z21472 ESTs; Moderately similar to !!!! ALU SUB 5.1

105921 Hs.169119 AA402613 ESTs 5.1

103375 Hs.54416 X91868 sine oculis homeobox (Drosophila) homolo 5.1

128871 Hs.106778 AA400271 ESTs; Highly similar to (defline not ava 5.1

112681 Hs.148932 R87331 ESTs; Moderately similar to semaphore V 5.1

105784 Hs.226434 AA350771 ESTs 5.1 

116238 Hs.47144 AA479362 ESTs 5 

102913 Hs.80342 X07696 keratin 15 5

103011 Hs.326035 X52541 early growth response 1 5

126023 H58881 yr36d09.r1 Soares fetal liver spleen 1NF 5

103709 Hs.13804 AA037316 ESTs 5

118981 Hs.39288 N93839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

134807 Hs.89732 X76932 zinc finger protein 273 5

100079 Hs.23311 AB002365 Human mRNA for KIAA0367 gene; partial cd 4.9

132047 Hs.3796 D83492 EphB6 4.9

132880 Hs.177537 AA444369 ESTs 4.9 

124049 Hs.74519 F10523 primase; polypeptide 2A (58kD) 4.8 

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 4.8 

104776 AA026349 ESTs 4.8 

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 4.8

103912 Hs.143087 AA251078 ESTs 4.8

113961 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8

105288 Hs.3585 AA233168 ESTs; Weakly similar to coded for by C. 4.8

135035 Hs.284186 H89575 ESTs 4.8

104144 Hs.183390 AA447439 ESTs; Weakly similar to ZINC FINGER PROT 4.8

129389 Hs.288126 AA621604 ESTs 4.8

125982 R98091 RAE1 (RNA export 1; S.pombe) homolog 4.8

125162 Hs.26243 W44682 ESTs 4.8

103023 Hs.117950 X53793 multifunctional polypeptide similar to S 4.7

129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7

104479 Hs.106390 N36040 ESTs 4.7

103731 AA070545 zm7c3j1 Stratagene neuroepithelium (#93 4.7

126575 Hs.127602 W72416 ESTs 4.7

124578 Hs.231500 N66321 Human glucose transporter-like protein-l 4.7

130617 Hs.1674 M90516 glutamine-fructose-6-phosphate transamin 4.7

116752 Hs.91622 K06373 Homo sapiens clone 24456 mRNA sequence 4.7

100279 Hs.32007 D42084 Human mRNA for KIAA0094 gene; partial cd 4.7

126288 Hs.89576 AI479264 ESTs 4.7 

131836 Hs.32990 AA610086 ESTs 4.7 

106717 Hs.239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7

114542 Hs.31011 AA055768 ESTs 4.6

103806 AA130614 zolfZrl Stratagene neuroepithelium NT2R 4.6

130529 AA173238 small inducible cytokine A5 (RANTES) 4.6

115675 Hs.82065 AA406546 ESTs 4.6

111386 Hs.293798 N95326 ESTs 4.6

106503 Hs2^79 AA452411 ESTs 4.6 

119943 Hs.14158 W86835 copine III 4.6 

104459 Hs.100070 M91493 EST 4.6 

100774 Hs.89603 HG371-HT1063 Mucin 1, Epithelial, Alt. Splice 6 4.6
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100652 Hs.142653 HG2825-HT2949 Ret Transforming Gene 4.6 

132015 Hs.3731 D11900 ESTs 4.6 

126086 H70975 yr73g01 j1 Soares fetal liver spleen 1NF 4.6 

130888 Hs.173094 F03819 ESTs 4.6 

106390 Hs.20166 AA446964 Prostate stem cell antigen 4.6

126959 AA199853 ESTs; Moderately similar to !!!! ALU SUB 4.5

131584 Hs.29117 X91648 H.sapiens mRNA for pur alpha extended 3' 4.5

104838 Hs.20953 AA039481 ESTs 4.5

125661 R50319 ESTs 4.5

103171 Hs.234726 X68733 alpha-1-antichymotrypsin 4.5

103928 Hs.199160 AA280085 ESTs 4.5

102899 Hs.75730 X06272 signal recognition particle receptor ('d 4.5

100892 Hs.180789 HG4557-HT4962 Small Nuclear Ribonucleoprotein U1, 1snr 4.5

106167 Hs.7956 AA425906 ESTs 4.5

129404 Hs.317584 AA172056 ESTs 4.5

106990 Hs.24758 AA521354 ESTs 4.5

132316 Hs.44566 U28831 Human protein immuno-reactive with anti- 4.4 

132056 Hs.38176 T89386 Homo sapiens mRNA for KIAA0606 protein; 4.4 

133718 Hs.198760 X15306 neurofilament; heavy polypeptide (200kD) 4.4 

101470 Hs.1846 M22898 tumor protein p53 (Li-Fraumeni syndrome) 4.4

131904 Hs.284296 AA143019 ESTs; Highly similar to surface 4 integr 4.4

105804 Hs.22514 AA383142 ESTs 4.4

122861 Hs.119394 AA464428 ESTs 4.4

111336 Hs.29894 N79565 ESTs 4.4

121944 Hs.98518 AA429278 ESTs 4.4

134401 Hs.211577 AA243746 ESTs; Highly similar to CGI protein [H.s 4.4

126458 Hs.288969 AA815252 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.4

133435 Hs.323966 T23983 ESTs; Moderately similar to !!!! ALU SUB 4.4

105178 Hs.21941 AA187490 ESTs 4.3

127315 AA640834 nr27b06Jl NCI_CGAP_Pr3 Homo sapiens cDN 4.3

132645 Hs.54424 X87870 H.sapiens mRNA for hepatocyte nuclear fa 4.3

116162 Hs.282990 AA461487 ESTs; Weakly similar to F52C122 [C.eleg 4.3

118040 Hs.47567 N52876 EST 4.3 

130008 Hs.278427 M31423 cerebellar degeneration-related protein 4.3

126607 Hs.114688 W87424 ESTs 4.3

123061 Hs.105130 AA482030 EST 4.3 

109391 Hs.184245 AA219699 ESTs 4.3 

109175 AA180496 ESTs 4.3 

127003 Hs.173540 AA550806 ESTs; Weakly similar to (defline not ava 4.3

102547 Hs.46638 U57911 chromosome 11 open reading frame 8 4.3

134208 Hs.79993 U88871 peroxisomal biogenesis factor 7 4.3

104258 Hs.5462 AF007216 solute carrier family 4; sodium bicarbon 4.3

130759 Hs.18946 AA094720 ESTs; Weakly similar to (defline not ava 4.3

132160 Hs.295923 AA281770 seven in absentia (Drosophila) homolog 1 4.3

135062 Hs.93872 AA174183 ESTs 4.3

126510 Hs.334762 R49702 ESTs; Weakly similar to KIAA0319 [H.sapi 4.2

122055 Hs.98747 AA431732 EST 4.2

133136 Hs.6574 AF007165 suppressin (nuclear deformed epidermal a 4.2

109890 Hs.20843 H04649 ESTs 4.2

133294 Hs.69997 R79723 H.sapiens mRNA for translin associated z 4.2

134436 Hs.83190 S80437 fatty acid synthase {3' region} [human, 4.2

107375 Hs.251064 U88573 NBR2 4.2

122223 Hs.27413 AA436158 ESTs 4.2

103044 Hs.248210 X55777 H.sapiens Mahlavu hepatocellular carcino 4.2

120125 Hs.59815 W99362 EST 4.2

128969 Hs.283978 T65327 ESTs; Highly similar to (defline not ava 4.2

129637 Hs.1179 D90359 TATA box binding protein (TBP)-associate 4.2

106566 AA455921 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.2

112605 Hs.29852 R79220 ESTs 4.2

103364 Hs.279929 X90872 H.sapiens mRNA for gp25L2 protein 4.2

132811 Hs.57419 U25435 transcriptional repressor 4.2

126570 Hs.326292 T79274 ESTs 4.2

116298 Hs.94109 AA489046 ESTs 4.2

103024 Hs.105938 X53961 lactotransferrin 4.1 

129133 Hs.108850 R56728 yg95c6.r1 Soares infant brain 1NIB Homo 4.1

133167 Hs.6641 N98707 kinesin family member 5C 4.1 

126871 Hs.14051 AA351779 ESTs 4.1 

132333 Hs.45032 AA192157 ESTs 4.1 

107376 Hs.327179 U90545 solute carrier family 17 (sodium phospha 4.1 






128517 Hs.100861 AA280617 ESTs; Weakly similar to p60 katanin [H.s 4.1

130555 Hs.116774 AA450324 ESTs 4.1

105765 Hs.24183 AA343514 ESTs 4.1

126529 Hs^6369 AA133237 ESTs 4.1

125928 Hs.181889 H29730 ESTs 4.1

117280 Hs.172129 N22107 ESTs; Moderately similar to !!!! ALU SUB 4.1

100234 Hs.3085 D29677 KIAA0054 gene product 4.1

100959 Hs.118127 J00073 actin; alpha; cardiac muscle 4.1

107130 Hs.12913 AA620582 ESTs; Weakly similar to (defline not ava 4.1

105035 Hs.8859 AA128486 ESTs 4.1

126735 Hs.226795 AA808949 glutathione S-transferase pi 4.1 

113056 Hs.8036 T26471 ESTs; Moderately similar to !!!! ALU SUB 4 

102460 Hs.211582 U48959 Homo sapiens myosin light chain kinase ( 4 

106968 Hs.26813 AA504631 ESTs; Weakly similar to (defline not ava 4 

123107 Hs.104207 AA486071 ESTs 4

127256 Hs.267967 AA327550 ESTs; Weakly similar to !!!! ALU SUBFAMI 4

105329 Hs.22862 AA234561 ESTs 4 

115504 Hs.42736 AA291946 ESTs 4 

120726 Hs.97293 AA293656 ESTs 4 

103576 Hs.34560 Z26317 desmoglein 2 4

127889 Hs.144941 AI147408 ESTs 4

106394 Hs.25320 AA447223 ESTs 4

128046 AA873285 ESTs 4

103391 Hs.114366 X94453 pyrroline-5-carboxylate synthetase (glut 4

106448 Hs.27004 AA449455 ESTs 4

126513 Hs.86276 W27601 ESTs; Moderately similar to (defline not 4

129593 Hs.98314 AA487015 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.9

110151 Hs.31608 H18836 ESTs 3.9

105344 Hs.8645 AA235303 ESTs 3.9 

104791 Hs.301871 AA029046 ESTs 3.9

123442 Hs.111496 AA598803 ESTs 3.9 

127800 Hs.79428 AA521047 BCL2/adenovirus E1B 19kD-interacting pro 3.9 

114555 Hs.167904 AA058594 ESTs 3.9 

122138 Hs.163960 AA435549 ESTs 3.9

129565 Hs.198726 X77777 vasoactive intestinal peptide receptor 1 3.9

103471 Hs.75216 Y00815 protein tyrosine phosphatase; receptor t 3.9

133908 Hs.325474 M83216 caldesmon 1 3.9

105635 Hs.301985 AA281508 ESTs 3.9

134285 Hs.81086 AA460012 solute carrier family 22 (organic cation 3.9

134125 Hs.50421 R38102 KIAA0203 gene product 3.9

125628 Hs.241493 AA418069 natural killer-tumor recognition sequenc 3.9

103695 Hs.186600 AA018758 ESTs 3.9

100642 Hs.182183 HG2743-HT3926 Caldesmon 1, Alt. Splice 6, Non-Muscle 3.9

104334 Hs.78771 D82614 ESTs 3.9

110242 Hs.19978 H26417 ESTs 3.9

125298 Hs.289008 Z39255 ESTs 3.9

104060 Hs.303193 AA397968 zt87a9.r1 Soares_testis_NHT Homo sapiens 3.9

105823 Hs.293960 AA398197 ESTs 3.9

126499 Hs.110445 AA315671 ESTs; Moderately similar to unknown (M.m 3.9

130752 Hs.18895 D50927 KIAA0137 gene product 3.8

123494 Hs.112110 AA599786 ESTs 3.8

104846 Hs.32478 AA040154 ESTs 3.8

108921 Hs.71721 AA142913 ESTs 3.8

115506 Hs.45207 AA292537 ESTs 3.8

100452 Hs.241552 D87742 Human mRNA for KIAA0268 gene; partial cd 3.8

104454 Hs.129228 M84443 galactokinase 2 3.8 

108730 Hs.102859 AA126254 ESTs 3.8 

131223 Hs.24427 AA247788 ESTs; Highly similar to (defline not ava 3.8 

104784 Hs.269228 AA027055 ESTs 3.8 

104946 Hs.73848 AA069549 ESTs 3.8

106932 Hs.9394 AA495926 ESTs 3.8 

101724 Hs.620 M69225 bullous pemphigoid antigen 1 (230/240kD) 3.8 

106140 Hs.14912 AA424524 Homo sapiens mRNA for KIAA0286 gene; par 3.8 

126135 Hs.269721 AA913491 ESTs 3.8

120030 Hs.58694 W92051 ESTs 3.8

126457 Hs.50382 AA007489 zh98g04.r1 Soares_fetal_liver_spleen_1NF 3.8

123917 Hs.112969 AA621311 EST 3.7

110714 Hs.17752 H95978 Homo sapiens phosphatidylserine-specific 3.7

130577 Hs.162 M35410 insulin-like growth factor binding prote 3.7
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102663 Ks.168075 U70322 karyopherin flmportin) beta 2 3.5 

126349 Hs.13531 AA442858 ESTs; Weakly similar to (defline not ava 3.5 

132154 Hs.41119 N67179 ESTs 3.5 

131689 Hs.30696 AA599653 transcription factor-like 5 (basic helix 3.5 

127862 Hs.163191 AA765305 EST 3.5

126995 Hs.189810 W26950 Human DNA sequence from PAC 388M5 on chr 3.5 

119071 H31180 ESTs 3.5 

103941 Hs.96593 AA282978 ESTs 3.5 

110721 Hs.31319 H97678 ESTs 3.5

126586 Hs.43086 AA011247 ESTs 3.5

103106 Hs.1857 X62025 phosphodiesterase 6G; cGMP-specific; rod 3.5

116357 Hs.90797 AA504806 Homo sapiens clone 23620 mRNA sequence 3.5

105309 Hs.4104 AA233790 ESTs 3.5

130796 Hs.19525 R39390 ESTs 3.5

109101 Hs.52184 AA167708 ESTs 3.5

103134 Hs.2839 X65724 Norrie disease (pseudoglioma) 3.5

131798 Hs.301449 X86098 adenovirus 5 E1A binding protein 3.5

118535 Hs.49418 N67968 ESTs 3.5 

102592 Hs.11223 U62389 Human putative cytosolic NADP-dependent 3.4 

125905 Hs.6456 T69868 chaperonin containing TCP1; subunit 2 (b 3.4

109160 Hs.301997 AA179387 ESTs 3.4 

105327 Hs.211593 AA234440 ESTs 3.4 

106586 Hs.57787 AA456598 ESTs 3.4 

122635 AA454085 EST 3.4 

132413 Hs.260116 AA132969 metalloprotease 1 (pitrilysin family) 3.4

131938 Hs.34956 AA283620 ESTs 3.4 

133871 Hs.182793 AA454597 ESTs 3.4 

107175 Hs.292503 AA621751 ESTs; Weakly similar to KIAA0601 protein 3.4 

101188 Hs.184298 L20320 cyclin-dependent kinase 7 (homolog of Xe 3.4 

126422 Hs.237658 H48518 ESTs; Highly similar to apolipoprotein A 3.4

118475 N66845 ESTs; Weakly similar to !!!! ALU CLASS B 3.4

104558 Hs.88959 R56678 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.4

128307 Hs.132005 AI453794 ESTs 3.4 

112254 Hs.25829 R51831 ESTs 3.4 

125408 Hs.89578 N72353 yv37e12.r1 Soares fetal liver spleen 1NF 3.4

109834 Hs.175955 H00604 ESTs 3.4 

130844 Hs.20191 D12122 seven in absentia (Drosophila) homolog 2 3.4 

127143 Hs.20843 AA533553 nj68h04.s1 NCI_CGAP_Pr10 Homo sapiens cD 3.4

135309 Hs.42500 D25984 ESTs 3.4

125724 Hs.295978 AA083407 stimulated trans-acting factor (50 kDa) 3.4

127692 Hs.187983 AI021912 ESTs 3.4 

116674 Hs.92127 F04816 ESTs 3.4 

134700 Hs.8868 AA481414 golgi SNAP receptor complex member 1 3.4

114846 Hs.166196 AA234929 ESTs 3.4

103649 Hs.155983 Z70219 H.sapiens mRNA for 5' UTR for unknown pro 3.4

134835 Hs.89925 L04569 calcium channel; voltage-dependent; L ty 3.4 

130568 Hs.16085 AA232535 ESTs; Highly similar to (defline not ava 3.4 

111331 Hs.15978 N78773 ESTs 3.4 

106036 Hs.10653 AA412505 ESTs 3.4 

130987 Hs.21893 R45698 ESTs 3.4

112814 Hs.35828 R98192 ESTs 3.4

127815 Hs.255015 AA876009 ob93c10.s1 NCI_CGAP_GCB1 Homo sapiens cD 3.4

100144 Hs.75616 D13643 KIAA0018 gene product 3.4

101129 Hs.247992 L10405 Homo sapiens DNA binding protein for sur 3.4

130874 Hs.20621 T08287 ESTs 3.4

106882 Hs.26994 AA489009 ESTs 3.4 

103855 Hs.302267 AA195179 ESTs 3.4 

125957 H45213 yo03b08.r1 Soares adult brain N2b5HB55Y 3.3

114048 Hs.146085 W94613 ESTs 3.3

109826 Hs.75354 F13702 ESTs 3.3

125355 Hs.170098 R45630 ESTs; Highly similar to KIAA0372 [H.sapi 3.3 

104182 Hs.143792 AA479990 ESTs; Weakly similar to glioma amplified 3.3 

100294 Hs.75454 D49396 Human mRNA for Apo1 .Human (MER5(Aop1-Mou 3.3 

131688 Hs.30692 U24153 p21 (CDKN1A)-activated kinase 2 3.3 

116256 Hs.88201 AA481256 ESTs; Weakly similar to (defline not ava 3.3

102034 Hs.230 U05291 fibromodulin 3.3 

130072 Hs.14658 R99606 Human chromosome 5q13.1 clone 5G8 mRNA 3.3 

114615 Hs.159456 AA083812 ESTs; Highly similar to (defline not ava 3.3 

128707 Hs.104105 AA136474 Meis (mouse) homolog 2 3.3 
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102214 Hs.32964 U23752 SRY (sex-determining region V>box 11 3.2 

123147 AA487961 ab11h6.s1 Stratagene lung (#93721) Homo 3.2

125435 Hs.272138 R00940 ye87g03.r1 Soares fetal liver spleen 1NF 3.2

116246 Hs^50646 AA479961 ESTs; Highly similar to ubiquitin-conjug 3.2

105169 Hs.180789 AA180321 Homo sapiens (clone S164) mRNA; 3' end o 3.2

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 3.2

124866 Hs.304389 R68571 ESTs 3.2

133205 Hs.67619 AA089559 Homo sapiens mRNA; chromosome 1 specific 3.2

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 3.2

101232 Hs.242894 L28997 ADP-ribosylation factor-like 1 3.1

132906 Hs^34896 AA142857 ESTs; Highly similar to geminin [H.sapie 3.1

104281 Hs.5669 C14290 ESTs 3.1 

123926 Hs.227933 AA621348 ESTs; Highly similar to (defline not ava 3.1 

134464 Hs.239720 N79354 ESTs; Weakly similar to Rga [D.melanogas 3.1 

105322 Hs.16346 AA234100 ESTs 3.1

100631 Hs.48332 HG2709-HT2805 Serine/Threonine Kinase (Gb225431) 3.1 

130791 Hs.199263 AA259102 ESTs; Highly similar to (defline not ava 3.1 

131220 Hs.300855 R77200 ESTs 3.1 

113237 Hs.123642 T62857 ESTs 3.1 

125562 Hs.98968 AI494372 ESTs 3.1

134110 Hs.79136 U41060 Human breast cancer; estrogen regulated 3.1 

132393 Hs.47334 W85888 ESTs; Moderately similar to ALU SUB 3.1 

107439 Hs£96842 W27895 ESTs; Moderately similar to nonmuscle m 3.1

125863 Hs.40719 AA299096 Homo sapiens mRNA; cDNA DKFZp564M0916 (f 3.1

105811 Hs.286192 AA394121 ESTs 3.1

129284 Hs.296141 AA104023 ESTs 3.1

125321 Hs.178294 T86652 ESTs 3.1 

107332 Hs.183297 T87750 ESTs 3.1 

123570 Hs.109653 AA608955 ESTs 3.1 

100384 Hs.90800 D83646 matrix metalloproteinase 16 (membrane-in 3.1

109063 Hs.38972 AA161043 tetraspan 1 3.1

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1

131839 Hs.33010 H80622 Homo sapiens mRNA for KIAA0633 protein; 3.1

117606 Hs.44698 N35115 ESTs 3.1

418998 Hs.287849 F13215 ESTs 3.1

125180 Hs.103120 W58344 ESTs 3.1

100789 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1

126017 Hs.159440 H60487 ESTs 3.1 

132452 Hs£47324 AAQG5262 Homo sapiens DNA sequence from PAC 262D1 3.1 

129077 Hs.108479 H78886 ESTs 3.1

126563 Hs.181368 W26247 U5 snRNP-specific protein (220 kD); orth 3.1

129650 Hs.118258 N52554 ESTs 3.1

123465 AA599033 ESTs 3.1

126486 Hs.152316 AA345339 EST51345 Gall bladder II Homo sapiens cD 3.1

126460 Hs.167031 W01616 za36d05.r1 Soares fetal liver spleen 1NF 3.1

118697 Hs.43234 N72094 ESTs 3.1

103860 Hs.38057 AA203742 ESTs 3.1

127968 Hs.124347 AA971439 ESTs 3.1 

124984 Hs.223241 T47566 yt)15c1U1 Stratagene placenta (#937225) 3.1 

103903 Hs.15220 AA249334 j312^eqP Human fetal heart, Lambda ZAP 3.1

106697 Hs.22242 AA463737 ESTs 3.1

130892 Hs.20993 AA442604 ESTs; Weakly similar to Ydr374cp [S.cere 3

114032 Hs.35014 W92779 ESTs 3

128835 Hs.106390 W15528 ESTs 3

103667 Hs.247815 Z80788 H.sapiens H4/I gene 3

126264 Hs.250614 N42897 yy13h06.r1 Soares melanocyte 2NbHM Homo 3

132626 Hs.21275 D25755 ESTs 3

131107 Hs.75354 N87590 ESTs 3

126780 Hs.5811 R12421 ESTs 3

127363 Hs.22116 AA307744 Homo sapiens Cdc146 1 phosphatase mRNA; c 3

103690 Hs.84063 AA016186 ESTs 3 

102589 Hs.8867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3 

125144 Hs.24336 W37999 ESTs 3 

132977 Hs.301404 U28686 RNA binding motif protein 3 3 

120714 Hs.146170 AA292689 ESTs 3

101038 Hs.79411 J05249 replication protein A2 (32kD) 3 

102856 Hs.248177 X00090 Human histone H3 gene 3 

105516 Hs.30738 AA257971 ESTs 3 

131137 Hs.33287 U85193 nuclear factor I/B 3
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127221 HS341551 AI354332 ESTs 3 

411888 Hs.24104 R26708 ESTs 3 

131684 Hs.3066 U26174 granzyme K (serine protease; granzyme 3; 3 

100629 Hs£1291 HG2706-HT2802 Serine/Threonine Kinase (Gb225428) 3 

119944 Hs.58915 W86838 EST 3

113801 Hs.118281 W38418 zinc finger protein 266 3 

133780 Hs.76152 M14219 decorin 3 

104690 Hs.14449 AA010889 ESTs 3 

126371 Hs.304139 N57645 EST 3 

127635 Hs.116346 AA766903 ESTs 3

128434 Hs.143880 AI190914 ESTs 3 

435761 Hs.187555 AA701941 ESTs 3 

125025 Hs.50748 T71561 ESTs 3 

124940 Hs.103804 R99599 heterogeneous nuclear ribonucleoprotein 3

128742 Hs£51531 D00763 proteasome (prosome; macropain) subunit; 3

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat reg 3

112068 Hs.22545 R43910 ESTs 3

105346 Hs.363727 AA235465 ESTs; Moderately similar to !!!! ALU SUB 3

130972 Hs.21739 AA370302 Homo sapiens mRNA; cDNA DKFZ^ 3

131230 Hs£74407 AA149987 thymus specific serine peptidase 3

133743 Hs.75847 N79435 ESTs 3

127402 Hs.227949 AA358869 ESTs; Highly similar to SEC13-RELATED PR 3

117483 Hs.44189 N30426 ESTs 3

123659 Hs.112699 AA609368 ESTs 3

103963 Hs.63290 AA298588 EST114219 HSC172 cells II Homo sapiens c 3

103795 Hs.7367 AA112222 ESTs; Moderately similar to (defline not 3

115092 Hs.80975 AA255903 CD39-like 4 2.9

134831 Hs.89890 S72370 pyruvate carboxylase 2.9 

128579 Hs.101810 AA093378 ESTs; Weakly similar to !!!! ALU SUBFAMl 2.9 

134193 Hs.7980 F09570 ESTs 2.9

123522 Hs.112575 AA608577 ESTs 2.9

107109 Hs.32793 AA609943 ESTs 2.9

134694 Hs.88556 D50405 histone deacetylase 1 2.9

134399 Hs.82689 H99801 tumor rejection antigen (gp96) 1 2.9

134632 Hs.174139 AA398710 H. sapiens RNA for CLCN3 2.9

106683 Hs.14512 AA461495 ESTs 2.9 

108555 AA084963 zn13e12.s1 Stratagene hNT neuron (#93723 2.9 

100953 Hs.2110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L12693) 2.9 

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 2.9

101813 Hs.139226 M87338 replication factor C (activator 1) 2 (40 2.9

106636 Hs.286 AA459950 ESTs 2.9

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 2.9

125819 Hs.251871 AA044840 stromal cell-derived factor 1 2.9

106282 Hs.9857 AA433946 ESTs; Weakly similar to (defline not ava 2.9

100386 Hs.301636 D83703 peroxisomal biogenesis factor 6 2.9

114546 Hs.98074 AA056263 ESTs; Moderately similar to !!!! ALU SUB 2.9

105914 Hs.9701 AA402224 Homo sapiens growth arrest and DNA-damag 2.9 

108552 AA084912 zn11c7.s1 Stratagene hNT neuron (#937233 2.9 

126505 Hs.190057 W26894 16a11 Human retina cDNA randomly primed 2.9 

134098 Hs.79086 X06323 Human MRL3 mRNA for ribosomal protein L3 2.9

129721 Hs.211539 L19161 eukaryotic translation initiation factor 2.9

100076 Hs.277422 AB000897 Homo sapiens mRNA for cadherin FIB3, par 2.9

117466 Hs.44104 N29862 ESTs 2.9

106335 Hs.36688 AA437258 ESTs; Moderately similar to WAP four-dis 2.9

134510 Hs.250870 U25265 protein kinase; mitogen-activated; kinas 2.9

105835 Hs.32995 AA398412 ESTs 2.9

106611 Hs.26267 AA458904 ESTs; Weakly similar to torsinA [H.sapie 2.9

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9

100641 Hs.182183 HG2743-HT2846 Caldesmon 1, Alt. Splice 4, Non-Muscle 2.9

104602 R86920 ESTs 2.9

117203 Hs.42738 H99799 ESTs 2.9 

131889 Hs.34073 AA401912 BH-protocadherin (brain-heart) 2.9 

101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 2.9 

115271 Hs.5724 AA279422 ESTs 2.9 

125812 Hs.287912 H73420 lectin; mannose-binding; 1 2.9

110740 Hs.19762 H99675 ESTs 2.9

103406 Hs^85728 X95677 H.sapiens mRNA for ArgBPIB protein 2.9

104577 Hs.132390 R71539 ESTs 2.9

102772 Hs.161002 U83115 absent in melanoma 1 2.9
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131710 Hs.30985 AA233225 ESTs; Highly similar to (define not ava 2.9 

125231 Hs.268903 W84714 ESTs 2.9 

127380 Hs.15535 AI417137 Homo sapiens clone 24582 mRNA sequence 2.9

104229 Hs.61289 AB002346 inositol phosphate 5-phosphatase 2 (syn 2.9

126600 Hs.191385 AA699949 ESTs 2.9

125175 Hs.303030 W52355 EST 2.9

103849 Hs.34578 AA187045 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9

102126 Hs.78961 U14575 protein phosphatase 1; regulatory (inhib 2.9 

124906 Hs.107815 R87647 ESTs 2.9 

131148 Hs.303125 C00038 ESTs 2.9

123158 Hs.218329 AA488658 heat shock 70kD protein 1 2.9

133667 Hs.75462 U72649 Human BTG2 (BTG2) mRNA; complete cds 2.9

105182 Hs.18271 AA191014 ESTs; Weakly similar to Ydr372cp [S.cere 2.9

133968 Hs£32068 D15050 Human mRNA for transcription factor AREB 2.9

117425 HS536901 N27154 ESTs 2.9

111087 Hs.37637 N59645 ESTs 2.9

129641 Hs.11805 N66066 ESTs 2.9

128639 Hs.102897 N91246 ESTs 2.9

133209 Hs.79265 AA114183 ESTs; Moderately similar to glutamate py 2.9

135154 Hs.267812 AA126433 sorting nexin 4 2.9

126838 Hs.279609 AA858097 pigment epithelium-derived factor 2.9

103803 Hs.106149 AA127696 ESTs 2.9 

102139 Hs£128 U15932 dual specificity phosphatase 5 2.9 

128104 AA971000 op67g1U1 Soares_NFL_T_GBC_S1 Homo sapi 2.8

127834 Hs.337631 AA761415 nz22d08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.8

133101 Hs.180952 AA488230 ESTs 2.8

127250 Hs.217916 AI023717 ESTs 2.8

135063 Hs.93883 D10537 myelin protein zero (Charcot-Marie-Tooth 2.8

126323 Hs.68644 N45014 yy80g06.r1 Soares_multiple_sclerosis_2Nb 2.8

121873 Hs.145696 AA426270 ESTs 2.8

122090 Hs.98684 AA432141 ESTs 2.8

118728 Hs.322645 N73705 ESTs 2.8

135400 Hs.99915 M23263 androgen receptor (dihydrotestosterone r 2.8

125278 Hs.129998 W93523 ESTs 2.8

124387 Hs.109019 N27637 ESTs 2.8

124803 Hs.12186 R45480 cyclin K 2.8

H45968 Hs.32149 H45968 ESTs 2.8

104261 Hs.5409 AF008442 RNA polymerase I subunit 2.8

105366 Hs^82093 AA236356 ESTs 2.8

106070 Hs.5957 AA417761 Homo sapiens clone 24416 mRNA sequence 2.8

131356 Hs.25960 M13241 v-myc avian myelocytomatosis viral relat 2.8

112009 Hs.26255 R42714 EST 2.8 

133199 Hs.250175 AA609773 Homo sapiens clone 23904 mRNA sequence 2.8 

110379 Hs.33130 H44825 ESTs 2.8

103890 Hs.72085 AA236843 ESTs; Weakly similar to unknown [S.cerev 2.8

128152 R20353 yg20f10.fi Soares infant brain 1NIB Homo 2.8

107008 Hs£3740 AA598710 ESTs 2.8

135243 Hs.57101 AA215333 ESTs 2.8

103058 Hs.184510 X57348 stratifin 2.8

132020 Hs.293845 AA428990 ESTs 2.8

116354 Hs.292566 AA504262 ESTs 2.8

125867 Hs.12372 H98141 ESTs 2.8

120603 Hs.98541 AA282787 ESTs; Highly similar to (defline not ava 2.8

115119 Hs.46847 AA256524 Human DNA sequence from clone 30M3 on ch 2.8 

133865 Hs.170290 F09315 discs; large (Drosophila) homolog 5 2.8

109415 Hs.110826 AA227219 Homo sapiens CAGF9 mRNA; partial cds 2.8

128687 Hs.23767 Z38910 ESTs 2.8

109984 Hs.10299 H09594 ESTs; Moderately similar to !!!! ALU SUB 2.8

133179 Hs.66731 U81599 homeo box B13 2.8

115998 Hs.336629 AA448488 ESTs; Weakly similar to zinc finger prot 2.8

112180 Hs.25067 R49116 EST 2.8

120428 Hs.173694 AA236822 ESTs; Moderately similar to (defline not 2.8

106241 Hs.6019 AA430108 ESTs 2.8

131060 Hs.22564 AA160890 myosin VI 2.8

111383 Hs.40919 N94527 ESTs 2.8

102123 Hs.1594 U14518 centromere protein A (17kD) 2.8

102722 Hs.79981 U79242 Human clone 23560 mRNA sequence 2.8

129887 Hs.274324 W92041 PCAF associated factor 65 alpha 2.8

126663 Hs.181297 AA714635 ESTs 2.8
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04367 Hs.1 34342 H17438 ESTs; Weakly similar to seventransmembra 2.8 

07316 Hs.193700 T63174 ESTs; Moderately similar to !!!! ALU SUB 2.8

28059 Hs.145096 AA972446 ESTs 2.8

24447 N48000 ESTs 2.8

11398 Hs.125565 R00086 deafness; X-linked 1; progressive 2.8

34085 Hs.79018 U20979 chromatin assembly factor I (150 kDa) 2.8

24788 Hs.100912 R43543 ESTs 2.8

12248 Hs.326416 R51361 ESTs 2.8

21309 Hs.97312 AA402482 ESTs 2.8

03076 Hs.75319 X59618 ribonucleotide reductase M2 polypeptide 2.8

07071 Hs.35198 AA609053 ESTs 2.8

04425 Hs.35380 H88496 ESTs 2.8 

32991 Hs.62245 AA446906 solute carrier family 25 (mitochondrial 2.8 

04968 Hs.29669 AA084602 ESTs 2.8 

21153 Hs.97694 AA399640 ESTs 2.8 

31216 Hs.243901 D31058 ESTs 2.8 

382 Hs.22869 F09299 ESTs 2.8 
31990 Hs.168818 H77734 ESTs; Moderately similar to roundabout 1 2.8 
32027 Hs.181444 N78844 ESTs; Weakly similar to R12C12.6 [Celeg 2.8 
27383 Hs.190478 AA447990 ESTs 2.8
32598 Hs.530 M81379 collagen; type IV; alpha 3 (Goodpasture 2.8
01121 Hs.1313 L09753 tumor necrosis factor (ligand) superfami 2.8
23000 Hs.105640 AA479347 ESTs 2.8
21329 Hs.1755 AA404324 ESTs 2.8
00481 Hs.121489 HG1098-HT1098 Cystatin D 2.7
13803 Hs.283683 W42789 ESTs 2.7 
10934 Hs.169001 N48708 ESTs; Weakly similar to cytochrome P-450 2.7 

432888 T86823 ESTs 2.7 

21802 Hs.188898 AA424328 ESTs 2.7 

30396 Hs.155313 AB002331 Human mRNA for KIAA0333 gene; partial cd 2.7 

21103 Hs.97697 AA398936 ESTs; Weakly similar to (defline not ava 2.7 

31129 Hs.23240 R27296 ESTs 2.7 

30943 Hs.272429 D50855 calcium-sensing receptor (hypocalciuric 2.7 

34676 Hs.87819 W28051 ESTs; Weakly similar to keratin 9; cytos 2.7 

11900 Hs.25318 R39044 ESTs 2.7 

06025 Hs.173334 AA412063 ESTs 2.7 

26144 Hs.40639 N39696 yx92a07 j1 Soares melanocyte 2NbHM Homo 2.7 

03248 Hs.75262 X77383 cathepsin O 2.7 

27230 Hs.274170 H30501 Homo sapiens Opa-interacting protein OIP 2.7 

01584 Hs.84072 M35252 transmembrane 4 superfamily member 3 2.7 

24131 Hs.167489 H19980 ESTs 2.7 

29689 Hs.77873 AA130156 ESTs 2.7 

132892 Hs.9973 W92797 ESTs 2.7 

120827 Hs.132967 AA347717 ESTs 2.7 

34579 Hs.85963 N23222 ESTs; Moderately similar to !!!! ALU SUB 2.7 

06149 Hs^56301 AA424881 ESTs 2.7 

32037 Hs.332541 AA203649 ESTs; Weakly similar to HEM45 [H.sapiens 2.7 

30542 Hs.179825 U64675 Human sperm membrane protein BS-63 mRNA, 2.7 

22851 Hs.99598 AA463627 ESTs 2.7 

383 Hs.196384 D28235 prostaglandin-endoperoxide synthase 2 (p 2.7 
20537 Hs.160422 AA262790 ESTs 2.7 
31036 Hs.174140 X64330 ATP citrate lyase 2.7 

389 Hs.211582 AA099391 ESTs 2.7 

28847 Hs.106529 AA424199 zv81e01.M Soares_total_fetus_Nb2HF8_9w 2.7 

12755 Hs.306044 R93802 ESTs 2.7 

423239 AA323591 EST26392 Cerebellum II Homo sapiens cDNA 2.7 

05031 Hs.12321 AA127240 ESTs 2.7 

26021 Hs.187516 AA775894 ESTs 2.7 

02116 U13706 Human ELAV-like neuronal protein 1 isofo 2.7 

33394 Hs.237225 R16759 ESTs; Weakly similar to (defline not ava 2.7 

04267 Hs.278439 C00358 ESTs 2.7 

07614 Hs.40241 AA004878 ESTs; Highly similar to (defline not ava 2.7 

29809 Hs.1259 X55283 asialoglycoprotein receptor 2 2.7 

12109 Hs.283309 R45221 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.7 

28422 T85681 yd60c06/1 Soares fetal liver spleen 1NP 2.7 

09494 Hs.43899 AA233702 ESTs 2.7 

18696 Hs.292284 N72086 Homo sapiens RNA polymerase III largest 2.7 

06053 Hs.36727 AA416963 ESTs; Highly similar to histone H2A [H.s 2.7 

04440 Hs.284380 L20492 gamma-glutamyltransferase 1 2.7 

129426 Hs.111323 AA412087 EST; Highly similar to (defline not ava 2.7 

123798 AA620411 small inducible cytokine A5 (RANTES) 2.7 

106716 H&238928 AA464962 ESTs 2.7 

103663 Z78291 Z78291 Homo sapiens brain fetus Homo sap 2.7 

114162 Hs.22265 Z38909 ESTs 2.7 

113063 Hs.5027 T32438 ESTs 2.7 

127897 AA773857 af80c09.r1 Soares_NhHMPu_S1 Homo sapiens 2.7 

130621 Hs.16803 AA621718 ESTs; Weakly similar to (defline not ava 2.7 

116245 Hs.42796 AA479958 ESTs; Highly similar to (defline not ava 2.7 

125499 R11878 yf49d11.r1 Soares infant brain 1NIB Homo 2.7 

133960 Hs.77899 M19267 tropomyosin 1 (alpha) 2.7 

104470 Hs.246358 N28843 ESTs; Weakly similar to Similar to colla 2.7 

134982 Hs.92308 N46086 ESTs 2.7 

106803 Hs.284295 AA479114 ESTs 2.7 

104899 Hs.285574 AA054726 ESTs 2.7 

125401 Hs.337585 AI204637 ESTs; Moderately similar to KIAA0350 [H. 2.7 

111253 Hs.15768 N70042 ESTs; Moderately similar to !!!! ALU SUB 2.7 

118449 Hs.164478 N66413 ESTs; Weakly similar to (defline not ava 2.7 

134507 Hs.84318 M63488 replication protein A1 (70kD) 2.7 

121609 Hs.98185 AA416867 EST 2.7 

113835 Hs.27475 W56590 ESTs 2.7 

113962 Hs^85290 W86375 ESTs; Highly similar to (defline not ava 2.7 

121913 Hs.98558 AA428062 ESTs 2.7 

108194 Hs£16717 AA057250 ESTs 2.7 

130799 Hs.12696 AA464273 ESTs 2.7 

123184 Hs.18166 AA489072 Homo sapiens mRNA for KIAA0870 protein; 2.7 

103420 Hs.173497 X97065 SEC23-like protein B 2.7 

106186 Hs.6315 AA427398 acetylserotonin N-methyltransferase-like 2.7 

101349 L77559 Homo sapiens DGS-B partial mRNA 2.7 

112954 Hs.6655 T16559 ESTs 2.7 

133054 Hs.291079 R07876 ESTs; Weakly similar to unknown [S.cerev 2.7 

128131 Hs.25640 AI283162 claudin 3 2.6 

101864 Hs.75777 M95787 transgelin 2.6 

111948 Hs.26303 R40752 ESTs 2.6 

130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6 

126507 Hs.23964 AI362218 ESTs 2.6 

117903 Hs.47111 N50740 ESTs 2.6 

116345 Hs.199067 AA496981 ESTs 2.6 

132227 Hs.4248 AA412620 ESTs 2.6 

125746 Hs.274256 H03574 yj42b06 Jl Soares placenta Nb2HP Homo sa 2.6 

105073 Hs.89463 AA137034 ESTs 2.6 

102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6 

131367 Hs.173933 AA456687 ESTs 2.6 

130792 Hs.19500 AA307896 nuclear localization signal deleted in v 2.6 

107427 Hs.46736 W26975 ESTs 2.6 

117477 Hs.44175 N30328 ESTs 2.6 

106290 Hs.16364 AA435542 ESTs 2.6 

126829 Hs.7910 R11547 ESTs 2.6 

118836 Hs.173001 N79820 ESTs 2.6 

100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6 

104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6 

135051 Hs.83484 C15324 ESTs 2.6 

126081 Hs.227835 AI346024 collagen; type I; alpha 1 2.6 

123579 AA608983 af5d4.s1 Soares_testis_NHT Homo sapiens 2.6 

130115 Hs.149923 M31627 X-box binding protein 1 2.6 

101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6 

122962 Hs.104720 AA478429 ESTs; Moderately similar to !!!! ALU SUB 2.6 

126151 Hs.40808 AA324743 ESTs 2.6 

128925 Hs.21851 D61676 Homo sapiens mRNA; cDNA DKFZp586J2118 (f 2.6 

126919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6 

130296 Hs.154103 R09286 LIM protein (similar to rat protein kina 2.6 

128402 Hs.191637 AA457244 ESTs 2.6 

129273 Hs.109968 W63783 ESTs 2.6 

125483 Hs.7788 F07759 ESTs 2.6 

132953 Hs.321264 AA029927 ESTs 2.6 

130963 Hs.21639 U57099 nuclear protein; marker for differential 2.6 

120614 Hs.194154 AA284281 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.6 

123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabin3 [R.no 2.6 

121710 Hs.96744 AA419011 ESTs 2.6 

125428 Hs.851 W74608 ESTs; Highly similar to (defline not ava 2.6 

115906 Hs.82302 AA436616 ESTs 2.6 

108432 AA076626 Homo sapiens clone 23851 mRNA sequence 2.6 

126191 Hs.191911 H97728 ESTs 2.6 

106164 Hs^81434 AA425773 ESTs 2.6 

111519 Hs^68615 R08165 ESTs 2.6 

134590 Hs.173840 W58612 ESTs 2.6 

102565 U59748 Human desert hedgehog (hDHH) mRNA, parti 2.6 

129879 Hs.13109 AA194973 ESTs 2.6 

114264 Hs.334609 Z40074 ESTs 2.6 

106236 Hs.21104 AA429951 ESTs 2.6 

135192 Hs.321709 AF000234 purinergic receptor P2X; ligand-gated io 2.6 

109833 Hs.29889 H00580 ESTs 2.6 

105756 Hs.8535 AA303088 ESTs; Weakly similar to transformation-r 2.6 

121422 Hs.97967 AA406210 ESTs 2.6 

130417 Hs.155485 U58522 Human huntingtin interacting protein (HI 2.6 

124312 Hs.102329 H94647 ESTs 2.6 

108998 Hs.97199 AA156058 ESTs 2.6 

127081 Hs.180591 R88362 ESTs; Weakly similar to weak similarity 2.6 

129574 Hs.11463 AA458603 ESTs; Weakly similar to (defline not ava 2.6 

112410 Hs.26904 R61680 ESTs 2.6 

123929 Hs.112981 AA621364 ESTs 2.6 

122905 Hs.104835 AA470070 ESTs 2.6 

116399 Hs.110637 AA599729 Homo sapiens homeobox protein A10 (HOXA1 2.6 

130279 Hs.153934 AA424044 core-binding factor; runt domain; alpha 2.6 

130021 Hs.1435 M24470 guanosine monophosphate reductase 2.6 

100585 Hs.199160 HG2367-HT2463 Trithorax Homolog Hrx 2.6 

104965 Hs.30177 AA084104 ESTs 2.6 

117711 Hs.46485 N45201 EST 2.6 

124792 Hs.48712 R44357 ESTs 2.6 

111299 Hs.74313 N73808 ESTs 2.6 

103616 Hs.32971 Z46973 phosphoinositide-3-kinase; class 3 2.6 

133629 Hs.195614 D13642 KIAA0017 gene product 2.6 

126484 Hs.169977 AI086782 ESTs 2.6 

100858 HG4245-HT4515 Forkhead Family Afx1 2.6 

133547 Hs.301927 X02883 T-cell receptor; alpha (V;D;J;C) 2.6 

126680 Hs.133865 F07097 ESTs 2.6 

125739 Hs.92137 AA428557 v-myc avian myelocytomatosis viral oncog 2.6 

102276 Hs.10247 U30999 Human (memc) mRNA, 3'UTR 2.6 

105586 Hs.191538 AA279137 ESTs 2.6 

103978 Hs.34136 AA307443 ESTs 2.6 

125054 Hs.268601 T80622 ESTs; Weakly similar to (defline not ava 2.6 

114212 H&21201 Z39338 ESTs; Highly similar to (defline not ava 2.6 

116959 Hs.40022 H79310 EST 2.6 

109228 Hs.306995 AA193366 ESTs 2.6 

133989 Hs.78202 U29175 SWI/SNF related; matrix associated; acti 2.6 

100640 Hs.182183 HG2743-HT2845 Caldesmon 1, Alt. Splice 3, Non-Muscle 2.6 

133093 Hs.285996 AA598749 ESTs 2.6 

114306 Hs.6540 Z40861 ESTs 2.6 

106060 Hs.171391 AA417287 C-terminal binding protein 2 2.5 

107748 Hs.60772 AA017258 EST 2.5 

100134 Hs.49 D13264 macrophage scavenger receptor 1 2.5 

133969 Hs.78 U13044 GA-binding protein transcription factor; - 2.5 

130992 Hs.74316 AA455001 ESTs 2.5 

127493 Hs.291701 AA808081 oc39a08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.5 

132869 Hs.203961 N26855 ESTs 2.5 

117570 Hs.44583 N34415 EST 2.5 

124644 Hs.109654 N91279 ESTs 2.5 

103558 Hs.2785 Z19574 keratin 17 2.5 

132883 Hs.5897 AA047151 ESTs 2.5 

102009 Hs.82643 U02680 protein tyrosine kinase 9 2.5 

116058 Hs.20159 AA454156 ESTs 2.5 

121989 Hs.193784 AA430044 ESTs 2.5 

131257 Hs.24908 AA256042 ESTs 2.5 

100320 Hs.75275 D50916 homolog of yeast (S. cerevisiae) ufd2 2.5 

102959 Hs.121524 X15722 glutathione reductase 2.5 

132969 Hs.6166 AA047616 ESTs 2.5 

130869 Hs.2057 AA128100 uridine monophosphate synthetase (orotat 2.5 

129645 Hs.118131 L38928 5;10-methenyltetrahydrofolate synthetase 2.5 

126399 Hs.83883 AA128075 zJ16d08.r1 Soares_pregnant_uterus_NbHPU 2.5 

134069 Hs.78935 U29607 Homo sapiens eIF-2-associated p67 homolo 2.5 

109816 Hs.61960 F11013 ESTs; Weakly similar to KIAA0176 [H.sapi 2.5 

134801 Hs.89695 X02160 insulin receptor 2.5 

104232 Hs.10587 AB002351 Human mRNA for KIAA0353 gene; partial cd 2.5 

107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 2.5 

106057 Hs.289074 AA417067 ESTs 2.5 

134252 Hs.80720 AA031782 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 2.5 

128062 Hs.105547 AA379500 ESTs 2.5 

110009 Hs.6614 H10933 ESTs 2.5 

111375 Hs.20432 N93696 ESTs 2.5 

122642 Hs.99361 AA454186 ESTs 2.5 

127999 Hs.69851 AA837495 ESTs; Weakly similar to Wiskott-Aldrich 2.5 

105029 Hs.13268 AA126855 ESTs 2.5 

105082 H&26765 AA143763 ESTs; Weakly similar to Similarity to S. 2.5 
TABLE 1A shows the accession numbers for those primekeys lacking UnigeneIDs in Table 
1. For each probeset, we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 
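
The mapping from probeset to gene cluster to member accessions described above can be pictured as a simple lookup. The following is a minimal Python sketch and not part of the original disclosure: the function names `parse_cluster_row` and `build_cluster_index` are hypothetical, CAT numbers are assumed to be single tokens without spaces (for example `48158_-7`), and each row is assumed to occupy one line, so wrapped accession lists would need to be joined first.

```python
import re

# Hypothetical helper, not from the source document: one table row is
# assumed to be "<6-digit Pkey> <CAT number> <accession> [<accession> ...]".
ROW = re.compile(r"^(?P<pkey>\d{6})\s+(?P<cat>\S+)\s+(?P<accessions>.+)$")


def parse_cluster_row(line):
    """Return (pkey, cat_number, [accessions]), or None if the line is not a row."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return (
        int(match.group("pkey")),
        match.group("cat"),
        match.group("accessions").split(),
    )


def build_cluster_index(lines):
    """Map each probeset key to its gene cluster number and member accessions."""
    index = {}
    for line in lines:
        parsed = parse_cluster_row(line)
        if parsed is not None:
            pkey, cat, accessions = parsed
            index[pkey] = {"cat": cat, "accessions": accessions}
    return index


rows = [
    "Pkey CAT number Accessions",  # header line, skipped by the parser
    "101964 48158_-7 S81578",
    "102565 32479_1 AB010994 U59748 AA064660",
]
index = build_cluster_index(rows)
print(index[102565]["accessions"])
```

Keying the index by Pkey mirrors how the table is meant to be read: a probeset from Table 1 that has no UnigeneID is looked up here to recover the sequences its cluster was built from.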



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accessions 



108552 111555_1 AA071210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370 

126023 1596090_1 H57661 H58881 

126086 1606216_1 H75681 H70975 

102565 32479_1 AB010994 U59748 AA064660 

101964 48158_-7 S81578 

125499 1562851_1 H10543 R11878 

125596 1708455_1 R25698 R56582 R56018 

118417 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 
BE466611 AI206344 AA574397 AA348354 AI493192 

125661 327827_1 AA491830 R50173 R55192 R50320 AI732306 AI732305 AI820727 AI820728 R55191 R50319 R50227 

125957 1583542_1 H41694 H45213 

125982 1766315_1 R98091 W92898 

127248 227560_1 AA364195 AA325029 AW962050 

103731 112052_1 AA070545 AA131490 AA131373 

127261 231687_1 AA330501 AA661567 

127265 232391_1 AA331503 AA332751 AW962542 

126659 1541209_1 T16245 R19694 F13545 H10299 T66048 T65279 H18006 

127315 37938_1 AF116622 AI114507 AA640834 AA377999 

103806 112616_1 AA130614 AA071410 

128104 502608_1 AA906093 AA971000 

104602 524482_1 K47610 R86920 

128152 297868_1 F07973 R20353 AA442660 

128422 1811283_1 T77794 T85681 

127897 446527_1 AA773681 AA773857 

106566 120358_1 BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 
AI369742 AI039658 AI885095 AI476470 AI287650 AI885299 AI985381 AW592624 AW340136 AI266556 AA456390 
AI310815 AA484951 

129735 44573_2 AI950087 N70208 R97040 N36809 AI308119 AW967677 N35320 AI251473 H59397 AW971573 R97278 W01059 
AW967671 AA908598 AA251875 AI820501 AI820532 W87891 T85904 U71456 T82391 BE328571 T75102 R34725 
AA884922 BE328517 AI219788 AA884444 N92578 F13493 AA927794 AI560251 AW874068 AL134043 AW235363 
AA663345 AW008282 AA488964 AA283144 AI890387 AI950344 AI741346 AI689062 AA282915 AW102898 AI872193 
AI763273 AW173586 AW150329 AI653832 AI762688 AA988777 AA488892 AI356394 AW103813 AI539642 AA642789 
AA856975 AW505512 AI961530 AW629970 BE612861 AW276997 AW513601 AW512843 AA044209 AW856538 
AA180009 AA337499 AW961101 AA251669 AA251874 AI819225 AW205862 AI683338 AI858509 AW276905 AI633006 
AA972584 AA908741 AW072629 AW513996 AA293273 AA969759 N75628 N22388 H84729 H60052 T92487 AI022058 
AA780419 AA551005 W80701 AW613456 AI373032 AI564269 F00531 H83488 W37181 W78802 R66056 AI002839 
R67840 AA300207 AW959581 T63226 F04005 

123147 219802_-2 AA487961 

130529 158447_1 AA178953 AA192740 

123579 genbank_AA608983 AA608983 

109175 genbank_AA180496 AA180498 

100789 tigr_HT4163 S67998 

100858 tigr_HT4515 U10072 

123798 579959_1 AA620411 AA287491 

102116 entrez_U13706 U13706 

102398 entrez_U42359 U42359 

102764 entrez_U82310 U82310 

118475 genbank_N66845 N66845 

104776 genbank_AA026349 AA026349 

104787 genbank_AA027317 AA027317 

113702 genbank_T97307 T97307 

113938 genbank_W81598 W81598 

122635 genbank_AA454085 AA454085 

108407 genbank_AA075519 AA075519 

108432 genbank_AA076626 AA076626 

108555 genbank_AA084963 AA084963 

101349 entrez_L77559 L77559 

124447 genbank_N48000 N48000 

119071 genbank_R31180 R31180 

103520 entrez_Y10511 Y10511 

103663 genbank_Z78291 Z78291 

128046 877605_1 AA873285 AI025762 

[Pkey missing] 546044_1 AA199853 AA206355 

123465 genbank_AA599033 AA599033 
TABLE 2 shows a preferred subset of the Accession numbers for genes found in Table 1 
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue (Relaxed ratio (87/70)) 






Pkey ExAccn UnigeneID Unigene Title 



131919 
120328 
101486 
119073 
133428 
128180 
104080 
127637 
131665 
101050 
130771 
107485 
106155 
129534 
100569 
101889 



AA121266 

AA196979 

M24902 

R32894 

M34376 

AA595348 

AA402971 

AA569531 

R22139 

K01911 

N48056 

W63793 

AA425309 

R73640 



Hs.272458 

Hs.290905 

Hs.1852 

Hs.279477 

Hs.183752 

Hs.171995 

Hs.57771 

Hs.162859 

Hs.30343 

Hs.1832 

Hs.1915 

Hs.262476 

Hs.33287 

Hs.11260 



ESTs 

ESTs; Weakly similar to (defline not ava 

acid phosphatase; prostate 

ESTs 

microseminoprotein; beta- 

kallikrein 3; (prostate specific antigen 

Homo sapiens mRNA for serine protease (T 

ESTs 

ESTs 



folate hydrolase (prostate-specific memb 
S-adenosylmethionine decarboxylase 1 
ESTs 
ESTs 



HG2261-HT2351 
S39329 Hs.181350 



133944 
130974 
114768 
104660 
131061 
126645 
135153 
107033 
118417 
126758 
107102 
116787 
115719 
123209 
101664 
112971 
117984 
129523 
132964 
121853 
119617 
105627 
101461 
124526 
133845 
133354 
119018 
100394 
106579 
114965 
112033 
102398 
101201 
101803 
120562 



Hs.99872 

Hs.7780 

Hs.2178 

Hs.182339 

Hs.14846 

Hs.268744 

Hs.61635 

Hs.95420 

Hs.113314 

Hs.293960 

Hs.30652 

Hs.15641 

Hs.59622 

Hs.203270 

Hs.121017 

Hs.83883 

Hs.106778 

Hs.274509 

Hs.167133 

Hs.98502 

Hs.55999 

Hs.23317 

Hs.76422 

Hs.293185 

Hs.76704 

Hs.334762 

Hs.278695 

Hs.66052 

Hs.23023 

Hs.72472 

Hs.22627 



L22524 Hs.2256 
M86546 Hs.155691 
AA280036 Hs.302267 



U05237 

AA045870 

X57985 

AA149007 

AA007160 

N64328 

AI167942 

N40141 

AA599629 

N66048 

W37145 

AA609723 

H28581 

AA416997 

AA489711 

M60752 

T17185 

N51919 

M30894 

AA031360 

AA425887 

W47380 

AA281245 

M22430 

N62096 

T68510 

AA055552 

N95796 

D84276 

AA456135 

AA250737 

R43162 



kallikrein 2; prostatic 
fetal Alzheimer antigen 
ESTs 

H2B histone family; member Q 
ESTs 
ESTs 

ESTs; Moderately similar to KIAA0273 [H. 
Homo sapiens BAC clone RG041D11 from 7q2 10.7 

Homo sapiens mRNA for JM27 protein; comp 10.6 

ESTs 10.6 

ESTs; Weakly similar to polymerase [H.sa 10.5 

ESTs 10.2 

ESTs 10.1 

ESTs 10.1 

ESTs 10 

ESTs 9.9 

H2A histone family; member A 9.8 

ESTs 9.7 

ESTs 9.7 

T-cell receptor; gamma cluster 9.4 

ESTs 9.2 

ESTs 9 

ESTs 8.9 

ESTs 8.8 

phospholipase A2; group IIA (platelets; 8.7 

yz61c5.s1 Soares_multiple_sclerosis_2NbH 8.5 

ESTs 8.2 

ESTs; Weakly similar to KIAA0319 [H.sapi 8.1 

ESTs 8 

CD38 antigen (p45) 8 

ESTs 7.6 

ESTs 7.4 

ESTs 7.1 

Human N33 protein form 1 (N33) gene, exo 7 

matrix metalloproteinase 7 (matrilysin; 6.9 

pre-B-cell leukemia transcription factor 6.8 

ESTs; Weakly similar to W01A6.c [C.elega 6.8 

18.9 

18.6 

17.3 

109112 AA169379 Hs.257924 ESTs 6.8 

109795 F10707 Hs.326416 ESTs 6.7 

130336 X07730 Hs.171995 kallikrein 3; (prostate specific antigen 6.6 

131425 AA219134 Hs.26691 ESTs 6.6 

132902 AA490969 Hs.59836 ESTs 6.6 

133724 U07919 Hs.75746 aldehyde dehydrogenase 6 6.5 

120215 Z41050 Hs.108787 Homo sapiens Mcd4p homolog mRNA; complet 6.5 

131681 AA010163 Hs.3383 upstream regulatory element binding prot 6.5 

100727 X07290 Hs.334766 Human HF.12 gene mRNA 6.3 

121770 AA421714 Hs.278428 Homo sapiens mRNA for KIAA0896 protein; 6.3 

123475 AA599267 Hs.250528 ESTs; Weakly similar to ANKYRIN; BRAIN V 6.3 

133061 AB000584 Hs.296638 prostate differentiation factor 6.3 

116429 AA609710 Hs.279923 ESTs; Weakly similar to similar to GTP-b 6.2 

101233 L29008 Hs.878 sorbitol dehydrogenase 6.2 

104691 AA011176 Hs.37744 ESTs 6.2 

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2 

105500 AA256485 HS522399 ESTs 6.1 

130828 AA053400 Hs.203213 ESTs 5.9 

115357 AA281793 Hs.72988 ESTs 5.8 

116334 AA491457 Hs.48948 ESTs 5.7 

120132 Z38839 Hs.125019 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6 

106375 AA443993 Hs£89072 ESTs 5.6 

124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 5.6 

101791 M83822 Hs.62354 Human beige-like protein (BGL) mRNA; par 5.5 

117698 N41002 Hs.45107 ESTs 5.5 

122041 AA431407 Hs.98732 Homo sapiens Chromosome 16 BAC clone CFT 5.5 

133723 AA088851 Hs.262476 S-adenosylmethionine decarboxylase 1 5.5 

113938 W81598 ESTs 5.4 

133015 AA047036 Hs.246315 ESTs 5.4 

108186 AA056482 Hs.7780 ESTs 5.3 

104466 N25110 Hs.326392 Human guanine nucleotide exchange factor 5.3 

104033 AA365031 Hs.98944 ESTs 5.3 

110844 N31952 Hs.167531 ESTs; Weakly similar to (defline not ava 5.3 

129056 H70627 Hs.108336 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3 

133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 5.3 

129184 W26769 Hs.109201 ESTs; Highly similar to (defline not ava 5.2 

101448 M21389 Hs.195850 keratin 5 (epidermolysis bullosa simplex 5.1 

116188 AA464728 Hs.184598 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1 

105921 AA402613 Hs.169119 ESTs 5.1 

103375 X91868 Hs54416 sine oculis homeobox (Drosophila) homolo 5.1 

128871 AA400271 Hs.106778 ESTs; Highly similar to (defline not ava 5.1 

116238 AA479362 Hs.47144 ESTs 5 

102913 X07696 Hs.80342 keratin 15 5 

103011 X52541 Hs.326035 early growth response 1 5 

118981 N93839 H&39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 
TABLE 2A shows the accession numbers for those primekeys lacking UnigeneIDs in Table 
2. For each probeset, we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 
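
To make the relationship between Table 2 and Table 2A concrete, the sketch below selects the probesets that carry no UnigeneID and looks up their cluster accessions. It is an illustrative Python fragment only: the record layout (dictionaries with `pkey`, `exaccn` and `unigene` fields), the function names, and the shape of `cluster_index` are all assumptions made for this example and are not taken from the disclosure. The two sample rows reuse values that appear in Table 2.

```python
def pkeys_lacking_unigene(rows):
    """Return the Pkeys of rows whose UnigeneID field is empty or absent."""
    return [row["pkey"] for row in rows if not row.get("unigene")]


def accessions_for(pkey, cluster_index):
    """Return the cluster accessions for a Pkey, or an empty list if untabulated."""
    return cluster_index.get(pkey, {}).get("accessions", [])


# Hypothetical in-memory form of two Table 2 rows.
table2_rows = [
    {"pkey": 133723, "exaccn": "AA088851", "unigene": "Hs.262476"},
    {"pkey": 113938, "exaccn": "W81598", "unigene": None},
]

# Hypothetical in-memory form of the matching Table 2A entry.
cluster_index = {
    113938: {"cat": "genbank_W81598", "accessions": ["W81598"]},
}

for pkey in pkeys_lacking_unigene(table2_rows):
    print(pkey, accessions_for(pkey, cluster_index))
```

Only the probeset without a UnigeneID is reported, which is the subset that Table 2A exists to document.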



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accession 



118417 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 
BE466611 AI206344 AA574397 AA348354 AI493192 

127248 227560_1 AA364195 AA325029 AW962050 

107033 235652_1 AI141999 AA730176 R44544 R41778 AW300793 AW966157 AA918501 AA599629 AI082195 AI198537 AW006520 
AW236663 AW151420 AI826987 AI810832 AI669102 AI201981 N27331 AA335566 T84522 BE085347 BE085269 

102398 entrez_U42359 U42359 

113938 genbank_W81598 W81598 
TABLE 3 shows genes, including expressed sequence tags, differentially expressed in 
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02 
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor 
samples and various normal tissue samples showing the highest expression of the gene. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 
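
As a purely illustrative aid to reading the R1 column, the sketch below computes a tumor-to-normal expression ratio in Python. This section does not state which summary statistics enter R1, so the use of a simple mean for both groups is an assumption made only for this example, as are the function name `tumor_to_normal_ratio` and the sample intensities.

```python
from statistics import mean


def tumor_to_normal_ratio(tumor_levels, normal_levels):
    """Ratio of mean tumor expression to mean normal expression.

    The choice of the mean as the summary statistic is an assumption of
    this sketch; the table legend only says "ratio of tumor to normal".
    """
    normal_level = mean(normal_levels)
    if normal_level <= 0:
        raise ValueError("normal expression level must be positive")
    return mean(tumor_levels) / normal_level


# Hypothetical intensities: mean tumor = 100.0, mean normal = 20.0.
ratio = tumor_to_normal_ratio([120.0, 80.0], [20.0, 20.0])
print(round(ratio, 1))
```

Under this reading, a probeset with an R1 of 5 would be expressed at roughly five times the normal-tissue level in the tumor samples.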



Pkey ExAccn UnigeneID Unigene Title 



00131 
00235 
00570 
00819 
01063 
01247 
01416 
01447 
01485 
01514 
01626 
01663 
01758 
01768 
01817 
01888 
02031 
02052 
02221 
02233 
02302 
02348 
02457 
02473 



D12485 
D29954 



Hs.11951 
Hs.13421 



HG2261-HT2352 
HG4G20-HT4290 
L00354 Hs.80247 



phosphodiesterase I/nucleotide pyrophosp 

KIAA0056 protein 

Hs.171995 

Hs.2387 



02696 
102751 
102823 



03043 
03093 
03376 
03401 
103613 
03677 
03962 
04084 
04257 
04301 



L33801 

M17254 

M21305 

M24736 

M28214 

M57399 

M60750 

M77836 

M81118 

M88163 

M99701 

U04898 

U07559 

U24576 

U26173 

U33052 

U37519 

U48807 

U49957 

U71207 

U75272 

U80034 

U90914 

X02544 

X54667 

X55733 

X60708 



04851 
04896 
04956 
04957 
04967 
05099 



X95240 

Z46629 

Z83806 

AA298160 

AA410529 

AF006265 

D45332 

AA025887 

AA040882 

AA054228 

AA074880 

AA074919 

AA084506 

AA150776 



Hs.78802 
Hs.279477 

Hs.89546 

Hs.123072 

Hs.44 

Hs.2178 

Hs.79217 

HsJ8989 

Hs.152292 

Hs.95243 

Hs.2156 

Hs.505 

Hs.3844 

Hs.79334 

Hs.69171 

Hs.87539 

Hs.2359 

Hs.180398 

Hs.29279 

Hs.1867 

Hs.68583 

Hs.5057 

Hs.572 

Hs.123114 

Hs.93379 

Hs.44926 

Hs.323378 

Hs.54431 

Hs.2316 

Hs.83243 

Hs.30732 

Hs.9222 

Hs.6783 

Hs.293943 

Hs.10290 

Hs.23165 

Hs.20509 

Hs.10026 

Hs.291000 

Hs.23729 

Hs.26369 



glycogen synthase kinase 3 beta 
v-ets avian erythroblastosis virus E26 o 
Human alpha satellite and satellite 3 ju 
selectin E (endothelial adhesion molecul 
RAB3B; member RAS oncogene family 
pleiotrophin (heparin binding growth fac 
H2B histone family; member A 
pyrroline-5-carboxylate reductase 1 

SWI/SNF related; matrix associated; acti 
transcription elongation factor A (SII)- 
RAR-related orphan receptor A 
ISL1 transcription factor; LIM/homeodoma 
LIM domain only 4 
nuclear factor; interleukin 3 regulated 
protein kinase C-like 2 
aldehyde dehydrogenase 8 
dual specificity phosphatase 4 
LIM domain-containing preferred transloc 
eyes absent (Drosophila) homolog 2 
progastricsin (pepsinogen C) 



carboxypeptidase D 
orosomucoid 1 
cystatin S 

eukaryotic translation initiation factor 
dipeptidylpeptidase IV (CD26; adenosine 
coated vesicle membrane protein 
specific granule protein (28 kDa); cyste 
SRY (sex-determining region Y)-box 9 (ca 
H.sapiens mRNA for axonemal dynein heavy 
ESTs 
ESTs 

estrogen receptor-binding fragment-assoc 
ESTs 

ESTs; Weakly similar to !!!! ALU SUBFAMI 
U5 snRNP-specific 40 kDa protein (hPrp8- 
ESTs 

ESTs; Weakly similar to hypothetical pro 
ESTs; Weakly similar to ORF YJL063C [S.c 
ESTs 

Homo sapiens clone 24405 mRNA sequence 
ESTs 



R1 

6.3 
5.1 

Antigen, Prostate Specific, Alt Splice 
10.5 



8.5 

47 

4.7 

11 

9.8 

6.2 

8.4 

4.9 

5.4 

7.5 

5.6 

5.7 

132 

8.9 

5.6 

7.4 

8.2 

5.9 

5.1 

5.7 

9 

10.6 

15.6 

4.9 

22.6 

4.7 

4.9 

5.8 

5.2 

7.4 

52 

4.9 

6 

6.4 

6.8 

105 

6.3 

4.9 

5.8 

6.4 

4.8 

65 

7 

5.1 
105304 AA233553 Hs.190325 ESTs 4.7 

105370 AA236476 Hs.22791 ESTs; Weakly similar to transmembrane pr 10.3 

105427 AA251330 Hs.28248 ESTs 5 

105542 AA261858 Hs.266957 ESTs; Weakly similar to heat shock prote 8.8 

105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zinc fi 55 

105640 AA281623 Hs.6685 ESTs; Weakly similar to KIAA0742 protein 8 

105645 AA282138 Hs.11325 ESTs 14 

105691 AA287097 Hs.289068 transcription factor 4 6.3 

105730 AA292701 Hs.5364 DKFZP564I052 protein 4.9 

105808 AA393608 Hs.286131 KIAA0438 gene product 7 

105826 AA398243 Hs.194477 ESTs; Moderately similar to similar to N 5 

105903 AA401433 Hs.200016 ESTs; Weakly similar to diphosphoinosito 9.9 

105906 AA401633 Hs.22380 ESTs 115 

106065 AA417558 Hs.25206 ESTs 5.1 

106094 AA419461 Hs.23317 ESTs 10.9 

106157 AA425367 Hs.34892 ESTs 6.6 

106184 AA426643 Hs.10762 ESTs 85 

106211 AA428240 Hs.126083 ESTs 8.4 

106213 AA428258 Hs5769 Homo sapiens mRNA; cDNA DKFZp564E153 (fr 5.7 

106272 AA432074 Hs.323099 ESTs 5.8 

106369 AA443828 Hs.288856 ESTs 65 

106400 AA447621 Hs.94109 ESTs 5.4 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 92 

106507 AA452584 Hs.267819 protein phosphatase 1; regulatory (inhib 5.6 

106523 AA453441 Hs.31511 ESTs 4.7 

106532 AA453626 Hs57443 ESTs 4.7 

106557 AA455087 Hs.22247 ESTs 5.7 

106575 AA456039 Hs.105421 ESTs 72 

106618 AA459249 Hs5715 ESTs; Weakly similar to Similarity with 5.6 

106820 AA481037 Hs.12592 ESTs 5.4 

106846 AA485223 Hs.34892 ESTs 55 

106973 AA505141 Hs.11923 Human DNA sequence from clone 167A19 on 75 

107110 AA609952 Hs.12784 KIAA0293 protein 6.1 

107127 AA620504 Hs.179898 ESTs 7.1 

107159 AA621340 Hs.10600 ESTs; Weakly similar to ORF YKR081c [S.c 52 

107217 D51095 Hs.35861 DKFZP586E1621 protein 15.1 

107365 U78294 Hs.111256 arachidonate 15-lipoxygenase; second typ 4.7 

107630 AA007218 Hs.60178 ESTs 55 

107734 AA016225 Hs.7517 ESTs 45 

107760 AA018042 Hs.252085 EST 7.6 

107997 AA037388 Hs52223 Human DNA sequence from clone 141H5 on c 105 

108012 AA039616 Hs.173334 ESTs 65 

108520 AA084138 Hs.46786 ESTs 7.9 

108583 AA088276 Hs58826 ESTs 5.6 

108613 AA100967 Hs.69165 ESTs 6 

108664 AA113349 Hs.69588 EST 65 

108677 AA115629 Hs.118531 ESTs 5.9 

108807 AA129968 Hs.49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 5.8 

108910 AA136590 ESTs 5 

108933 AA147224 Hs537232 ESTs 12.7 

108948 AA149579 Hs.118258 ESTs 65 

109014 AA156790 Hs.262036 ESTs 155 

109124 AA171529 Hs.183887 ESTs 6.1 

109142 AA176438 Hs.41295 ESTs 5.1 

109277 AA196332 Hs.86043 ESTs 55 

109342 AA213620 Homo sapiens mRNA; cDNA DKFZp586M1418 (f 6 

109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 105 

109565 F01930 Hs.23648 ESTs 7 

109648 F04600 Hs.7154 ESTs 9.9 

109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 6.4 

109859 H02308 Hs.20792 ESTs 55 

110181 H20276 Hs.31742 ESTs 16.8 

110854 N32919 Hs.27931 ESTs 10 

110924 N47938 Hs.12940 yy84a09.s1 Soares_multiple_sclerosis_2Nb 55 

111046 N55514 Hs.318584 ESTs 65 

111091 N59858 Hs53032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 52 

111157 N66613 Hs59364 ESTs 5 

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C 5.6 

111221 N68869 Hs.15119 ESTs 62 
111348 N90041 Hs.9585 ESTs 5.4 

111353 N90430 Hs.6616 ESTs 5.3 

111495 R07210 Hs.9683 ESTs 5.8 

111540 R08850 Hs.9786 ESTs 6 

111579 R10657 Hs.167115 KIAA0830 protein 12.6 

111581 R10684 Hs.5794 ESTs 7.1 

111734 R25375 Hs.128749 ESTs 65 

111861 R37460 Hs.25231 ESTs 9.4 

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 6.5 

111937 R40431 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 4.8 

111987 R42036 Hs.6763 KIAA0942 protein 6.4 

112184 R49173 Hs.330242 ESTs 5.6 

112286 R53765 Hs.158135 KIAA0981 protein 93 

112380 R59740 Hs.5740 ESTs 4.7 

112452 R63841 Hs.157461 ESTs 6 

112601 R79111 Hs.78225 annexin A1 5.4 

112753 R93696 Hs.169882 ESTs 5.8 

112902 T09262 Hs.129190 ESTs 5.1 

112984 T23457 Hs.289014 ESTs 4.9 

113021 T23855 Hs.129836 KIAA1028 protein 103 

113083 T40530 Hs.266957 ESTs; Weakly similar to heat shock prote 57 

113200 T57773 Hs.10263 ESTs 73 

113494 T88878 Hs.86538 ESTs 8.7 

113849 W60439 Hs.8858 ESTs; Moderately similar to cbpt 46 [M.mu 4.9 

113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7 

113950 W85765 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7 

113986 W87462 Hs.21894 ESTs 55 

113989 W87544 Hs.268828 ESTs 4.7 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213 

114340 Z41395 Hs.143611 ESTs 9.6 

114346 Z41450 Hs.130489 ESTs 52 

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4 

114463 AA025370 Hs.40109 KIAA0872 protein 82 

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5.4 

114721 AA131450 Hs.103822 ESTs 43 

114730 AA133527 Hs.331328 ESTs; Weakly similar to The KIAA0138 gen 5.1 

114833 AA234362 Hs.87159 ESTs; Moderately similar to CGI-66 prote 5.5 

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 63 

114884 AA235811 Hs293672 ESTs 52 

40 114895 AA236177 Hs.76591 KIAA0887 protein 4.7 

114908 AA236545 Hs.54973 ESTs 52 

114932 AA242751 Hs.16218 K1AA0903 protein 5.7 

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 52 

115140 AA258030 Hs279938 ESTs; Weakly similar to supported by GEN 5.9 

45 115468 AA287061 Hs48499 ESTs; Highly similar to Bdelght protein 4.7 

115583 AA398913 Hs.45231 UX)C1 protein 7.6 

115709 AA412519 Hs58279 ESTs 4.8 

115772 AA423972 Hs.131740 ESTs 5 

115774 AA424029 Hs288390 ESTs; Moderately similar to dynamin; int 5.4 

50 115776 AA424038 Hs31897 ESTs 5 

115821 AA427528 Hs.130965 ESTs; Weakly similar to ZINC FINGER PROT 13.7 

115955 AA446121 Hs.44198 Homo sapiens BAC done RG054D04 from 7q3 10.6 

116024 AA451748 Hs.83883 Human DNA sequence from clone 71 8J7 one 6.8 - 

116108 AA457566 Hs28777 ESTs 6 

55 116117 AA459117 Hs31575 SEC63; endoplasmic reticulum translocon 73 

116146 AA460701 Hs.15423 ESTs 55 

116296 AA489033 Hs.62601 Homo sapiens mRNA; cDNA DKFZp586K1318 (f 5.7 

116379 AA521472 Hs.71252 ESTs 5.9 

116393 AA599463 Hs306051 protein phosphatase 2 (formerly 2A);reg 5.9 

60 116401 AA599963 Hs.59698 ESTs 7.9 

116416 AA609219 Hs.39982 ESTs 92 

116587 D59325 Hs.121429 ESTs 52 

116601 D80055 Hs.45140 ESTs 4.9 

116684 F09156 Hs.66095 ESTs 72 

65 116722 F13654 HSFIH32 Stratagene catH937212 (1992) Horn 5.5 

116766 H13260 Hs.95097 ESTs 5.9 

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 6.9

117557 N33920 Hs.44532 diubiquitin 43 

117708 N45114 Hs.126280 ESTs 63 
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118001 N52151 Hs.47447 ESTs 11.4 

118229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 62

118599 N69207 Hs203697 ESTs 5.8 

118645 N70358 Hs.125180 growth hormone receptor 7.1

5 118873 N89881 Hs.44577 ESTs 6 

118985 N94303 Hs.55028 ESTs 9.3 

119107 R42424 Hs.63841 ESTs 6 

119126 R45175 Hs.117183 ESTs 17j9 

119271 T16387 Hs.65326 ESTs 6 

10 119367 T78324 Hs250895 ESTs 5 

119721 W69440 Hs.48376 ESTs 15.4 

119741 W70205 Hs.43670 kinesin family member 3A 10.1

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical pro 5.3 

120217 Z41078 Hs.66035 ESTs 43 

15 120266 AA173939 Hs2Q5442 ESTs; Weakly similar to inner centromere 83 

120294 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-62 antige 4.9 

120418 AA236010 Hs26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 4.7 

120466 AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6 

120524 AA261852 Hs.192905 ESTs 4.9 

20 120571 AA280738 Hs34892 ESTs 83 

120596 AA282074 HS237323 ESTs 62 

120713 AA292655 Hs36557 ESTs 9.9 

120992 AA398246 H&97594 ESTs 16.4 

121429 AA406293 Hs.41167 ESTs 6.9 

25 121503 AA412049 Hs290347 ESTs 73 

121512 AA412105 Hs.193736 ESTs 53 

121816 AA424814 HsA8827 ESTs 4.6 

122027 AA431302 Hs38721 EST; Weakly similar to N-copine [H.sapie 53 

122294 AA437311 Hs.98927 ESTs 5.7 

30 122411 AA446859 HS39083 ESTs 63 

122791 AA460158 Hs.129836 KIAA1 028 protein 124 

122792 AA460225 Hs.99519 ESTs 5.1 
122969 AA478539 Hs.104336 ESTs 4.9 
123095 AA485724 HS27413 ESTs 5.4 

35 123100 AA485957 Hs306219 Homo sapiens clone 25032 mRNA sequence 5 

123295 AA495981 Hs.250830 ESTs 4.7 

123311 AM96252 Hs.105069 ESTs 7A 

123583 AA609006 Hs.1 11240 ESTs 9.1 

123619 AA609200 ESTs 4J 

40 123645 AA609310 Hs.188691 ESTs 43 

123709 AA609651 Hs.112742 ESTs 7 

123968 C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5 

124178 H45996 Hs.97101 putative G protein-coupled receptor 63

124352 N21626 Hs.102406 ESTs 102 

45 124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 103 

124515 N58172 Hs.109370 ESTs 142 

124911 R88992 Hs.174195 ESTs 43 

125154 W38419 ESTs 4.7 

125992 W01626 za36e07j1 Soares fetal liver spleen 1NF 5.1 

50 126802 AA947601 Hs.97056 ESTs 5.1 

126812 Z36290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 4.6 

127080 AA662913 Hs.190173 ESTs 5 

127308 AA507628 Hs.334390 ESTs 43 - 

127370 AKJ24352 Hs.70337 immunoglobulin superfamily; member 4 4.7 

55 127386 AI457411 Hs.106728 ESTs 4.8 

127965 AA828760 Hs292059 ESTs 43 

128172 A1400862 Hs265130 ESTs 5 

128305 AI039722 Hs279009 ESTs 53 

128420 A1088155 Hs.41296 ESTs; Weakly similar to unknown [H.sapie 17 

60 128467 AA176446 Hs.1 80428 ESTs; Weakly similar to hypothetical 43. 43 

128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 7.9

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1

128651 AA446990 Hs.103135 ESTs 63 

129088 AA215971 Hs.194431 KIAA0992 protein 52 

65 129136 N26391 Hs250723 ESTs 5.1 

129171 AA234048 Hs.7753 calumenin 53 

129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 53

129386 N27524 Hs260024 Cdc42 effector protein 3 52 

129467 AA410311 Hs.44208 ESTs 5.1 
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129564 H22136 Hs.75295 guanylate cyclase 1 ; soluble; afcha 3 16.3 

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 9.2

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 8.6 

129823 X00946 Hs.105314 relaxin 2 (H2) 9.1

5 129847 W46767 Hs.296178 ESTs; Weakly similar to RNA POLYMERASE 1 5.4 

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 63 

129958 L20591 Hs.1378 annexin A3 5.1

129977 J04076 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6

130061 U82256 Hs.172851 arginase; type II 7.4 

10 130241 U78313 Hs.153203 MyoD family inhibitor 4.9 

130466 N21679 Hs.180059 ESTs 5.8 

130541 X05608 Hs.211584 neurofilament; light polypeptide (68kD) 6.7

130619 AA477739 Hs.12532 ESTs 6.4 

130925 N71935 Hs.169378 multiple PDZ domain protein 7.9

15 130938 AA013250 H&21398 ESTs; Moderately similar to PUTATIVE 6LU 62 

130971 H20332 Hs.301444 signal sequence receptor; gamma (transio 6.4 

131066 F09006 Hs.22588 ESTs 5 

131126 FO9012 Hs/1 81326 myotubularin related protein 2 6.4 

131310 J02960 Hs.2551 adrenergic; beta-2-; receptor; surface 7.9 

20 131487 AA253220 Ha27373 Homo sapiens mRNA;cDNADKFZp56401763(f 5.9 

131561 X59841 Hs.294101 pre-B-cell leukemia transcription factor 7.6

131562 U90551 Ha28777 H2A histone family; member L 5.1 
131579 N62922 Hs^9088 ESTs 11 
131629 AA442119 H&238809 ESTs 4.9 

25 131682 AA428368 Hs.30654 ESTs 4.8 

131699 R68657 Hs.90421 ESTs; Moderately similar to II!! ALU SUB 6.5 

131795 N32724 Hs.32317 Sox-like transcriptional factor 5.6 

132053 H93381 H&38085 ESTs; Weakly sMtor to putative glycine 12 

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator; wit 5.6 

30 * 132191 AA449431 H&288361 KIAA0741 gene product 8 

132256 AA608856 Hs.431 murine leukemia viral (bmM) oncogene h 5.5 

132482 AA429478 Hs.238126 ESTs; Highly similar to CGI-49 protein [ 6.6 

132533 AA021608 Hs.172510 ESTs 5.8 

132572 AA448297 Hs^37825 signal recognition particle 72kD 6.2 

35 132581 R42266 Hs.52256 ESTs; Weakly similar to beta-TrCP protei 16 

132700 N47109 Hs.5521 ESTs 6.8 

132701 AA279359 Hs£5220 BCL2-assoclated alhanogene 2 5.3 
132725 L41887 Hs.184167 splicing factor; arginine/serine-rich 7 7.8 
132783 N74897 H&278894 DEAD/H(Asp-Glu-AIa-Asp/His)boxpolypep 5.9 

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8

132939 U76189 Hs.61152 exostoses (multiple)-like 2 52

133142 F03321 Hs.65874 ESTs 52 

133342 U29589 Hs.7138 cholinergic receptor; muscarinic 3 10.3 

133434 AA278852 Hsi)212 ESTs 5.8 

45 133453 M68941 Hs.73826 protein tyrosine phosphatase; non-recept 4.9 

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) 13.1

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor t 4.6

133608 013315 Hs.75207 glyoxalasel 4.8 

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5 

133633 D21262 Hs.75337 nucleolar phosphoprotein p130 63

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6 

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4 

134095 U47414 Hs.79069 cyclin G2 52

134249 N89827 Hs.80667 RALBP1 associated Eps domain containing 6.5 

55 134321 AA418230 Hs.8172 ESTs 7 

134453 X70683 Hs.83484 SRY (sex determining region Y)-box 4 4.7 

134542 X57025 Hs.85112 insulin-like growth factor 1 (somatomedi 77

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4

134592 U82613 Hs.289104 Alu-binding protein with zinc finger dom 5.4 

60 134654 W23625 Hs.8739 ESTs; Weakly similar to ORF YGR200C [S.c 5 

134666 AA482319 Hs.8752 putative type II membrane protein 5.4 

134806 Z49099 Hs.89718 spermine synthase 6.7 

134951 AA431480 Hs.169358 ESTs 9.8 

135066 X04602 Hs.93913 interleukin 6 (interferon; beta 2) 5.7

65 135155 AA358268 Hs.1 66556 ESTs; Moderately similar to transcriptio 4.9 

135411 L10333 Hs.99947 reticulon 1 53

300023 M10098 AFFX control: 18S ribosomal RNA 4.6

300254 AW079607 Hs.55610 ESTs; Weakly similar to ZnT-3 [H^apiens 7.8 

300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115 
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300319 AW157646 Hs.153506 ESTs; Weakty similar to microtubute-adi 85 

300566 H86709 Hs.326392 son of sevenless (Drosophila) homolog 1 5.8 

300578 AI989417 Hs.134289 ESTs 4.4 

300671 AI239706 Hs.93810 ESTs 7.9 

5 300675 AA039352 Hs.125034 ESTs; Weakly similar to ORF YDL040C [S.c 4.5 . 

300680 AW468066 Hs24817 ESTs; Weakly similar to KIAA0986 protein 5.2 

300762 AI497778 Hs^Q509 ESTs 6.4 

300810 AKJ76890 Hs.146847 ESTs 5.8 

300813 AA406411 Hs.208341 ESTs; Weakly similar to KIAA0989 protein 10.6 

10 300823 A1863068 Hs.106823 ESTs; Weakly similar to putative zinc fi 5.6 

300834 AF109300 Hs.147924 ESTs 6.7 

300923 AW136372 Hs.1852 ESTs 7.6 

300962 AA593373 Hs.293744 ESTs 55 

301015 AA947682 Hs.20252 ESTs; Weakly similar to Chain A; Cdo42hs 7 

15 301042 AI659131 Hs.197733 ESTs 24.9 

301242 AW161535 Hs.23782 ESTs 11-8 

301254 AI049624 Hs.283390 EST cluster (not in UniGene) with exon h 4.3

301262 H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 4.3 

301388 AA156879 Hs.262036 ESTs; Weakly similar to ZINC FINGER PROT 6.6 

20 301563 AI802946 Hs.44208 ESTs; Weakly similar to match to ESTs AA 5.7 

301656 AW008475 Hs.151258 EST cluster (not in UniGene) with exon h 6,8 

301689 Z44810 Hs501789 ESTs; WeaWy similar to similar to C.ele 6.3 

301783 AL046347 Hs53937 Homo sapiens PAC done DJ1 159004 from 7p 62 

301805 A1800004 Hs.142846 ESTs; Weakly similar to MesP1 [Mjnusculu 85 

25 301846 R200Q2 Hs5823 ESTs; WeaWy similar to intrinsic tactor 4.6 

301891 AF131855 Hs.279591 Homo sapiens done 25056 mRNA sequence 6.3 

302005 A1869666 Hs.123119 ESTs 365 

302056 AI457532 Hs.30488 ESTs; Moderately similar to ROSA26 AS [M. 95 

302067 H05698 Hs.222399 ESTs; Weakly similar to protein-tyrosine 5.8 

30 302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 8.8 

302147 AB022660 Hs.151717 KIAA0437 protein 5.9 

302214 AJ001454 Hs.159425 Homo sapiens mRNA for testican-3 4.3 

302236 AI128606 Hs.6557 zinc finger protein 161 4.3 

302358 D81150 Hs522848 EST cluster (not in UniGene) with exon h 5.5

35 302410 NM_004917 Hs.218366 EST cluster (not in UniGene) with exon h 265 

302486 AC003682 Hs.183512 multiple UniGene matches 82 

302582 NM_000522 Hs.249195 EST cluster (not in UniGene) with exon h 6.4

302765 AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5

302792 AA3436S6 Hs.46821 ESTs; Weakly similar to putative [H ,sapi 4.8 

40 302881 AA508353 Hs.105314 relaxin 1 (H1) 783 

302692 N58545 Hs.42346 histone deacetylase 3 85 

302970 AW1 18352 Hs.312679 EST duster (not In UniGene) with exon h 7 A 

302977 AW263124 Hs.3151 11 EST duster (not in UniGene) with exon h 55 

303029 AF199613 EST cluster (not In UniGene) with exon h 4.6 

45 303125 AF161352 Hs.111782 EST cluster (not in UniGene) with exon h 5.8 

303280 AI571580 Hs.170307 ESTs 4.3 

303306 AA215297 Hs.61441 EST cluster (not in UniGene) with exon h 6.4

303309 AL134164 Hs.145416 ESTs 65 

303344 AA255977 Hs£50646 ESTs; Highly similar to ubiquitin-coniug 195 

50 303380 AA298471 Hs526567 EST duster (not In UniGene) with exon h 6.6 

303401 AA758552 HS509497 ESTs 6.8 

303525 AW516519 Hs.273294 ESTs 45 

303526 AA348111 Hs.96900 ESTs 12.1 - 
303540 AA355607 Hs.309490 ESTs; Weakly similar to MMSET type I [H. 82 

55 303572 AW338520 Hs^42540 ESTs 8.4 

303685 AW50O106 Hs.23643 EST duster (not in UniGene) with exon h 4.9 

303699 D30891 Hs.19525 EST duster (not m UniGene) with exon h 15.7 

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDAsubunito 6.3 

303718 A1741397 Hs.1 14658 ESTs 4.6 

60 303722 AA521510 Hs.145010 ESTs 125 

303732 AW502405 Hs.125759 ESTs; Weakly similar to tumor suppressor 4.3 

303735 AA707750 Hs.169055 ESTs; Weakly similar to cts-Golgi matrix 5.4 

303752 AI017286 Hs5957 EST duster (not m UniGene) with exon h 5.3 

303753 AW503733 Hs.9414 ESTs 13 
65 303813 AI275850 Hs.1 14558 EST duster (not in UniGene) with exon h 7.8 

304053 R00493 Hs.125565 translocase of inner mitochondrial membr 45 

304218 N66373 Hs£7973 ESTs; Weakly similar to ZK354.7 [Celega 6 

305200 AA668128 Hs.45207 EST singleton (not In UniGene) with exon 5.7 

306716 AI024916 Hs*51354 ESTs 5.7 
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313615 AW295194 Hs.301997 DKFZP434N126 protein 5-2 

313625 AW466402 Hs.254020 ESTs 7J6 

313634 AA688292 Hs.337786 ESTs 44 

313635 AA507227 Hs.6390 ESTs 8.1 
5 313638 AI753075 Hs.104627 ESTs 6.7 

313670 C16690 Hs.23767 EST cluster (not In UniGene) 4.4 

313671 W49823 Hs.104613 ESTs 4.4 
313676 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4 
313703 AI161293 Hs.280380 ESTs; Weakly similar to K1AA0525 protein 10 

10 313712 AA768553 Hs.74170 ESTs 5.2 

313800 AW296132 Hs.55098 ESTs 5.4 

313979 A1535B95 Hs.221024 ESTs 4.3 

314121 AI732100 Hs.187619 ESTs 13.6 

314123 AW245993 Hs.223394 ESTs 6.4 

15 314171 A1821895 Hs.193481 ESTs 29.4 

314188 AL138431 Hs.164243 ESTs 4.6 

314219 AL036001 Hs.48376 ESTs 5.7 

314236 AA743396 Hs.189023 ESTs 4.9 

314237 AA732359 Hs.96264 ESTs 4.4 
20 314284 AA731431 H&293464 EST duster (not In UniGene) 64 

314305 A1280112 Hs.125232 ESTs 5.3 

314343 AI754701 H&328476 ESTs; Weakly simitar to alternatively sp 62 

314530 AI052358 Hs.193726 ESTs 45 

314691 AW207206 Hs.136319 ESTs 17 

25 314695 AW502698 Hs.1 18152 ESTs 8.9 

314785 AI538226 Hs.32976 ESTs 9.4 

314801 AA481Q27 Hs.109045 ESTs; Weakly similar to ORFYGR245c[S.c 8 

314864 AA493811 Hs.294068 ESTs 6 

314907 AI672225 Hs222886 ESTs 19.3 

30 314916 AA548906 Hs.122244 ESTs 4.5 

314954 AA521381 Hs.187726 ESTs 5.3 

314981 AA524953 Hs.293334 ESTs 4.6 

315021 AA533447 Hs.312989 EST cluster (not in UniGene) 5.1 

315051 AW292425 Hs.163484 EST 155 

35 315052 AA876910 Hs.134427 ESTs 20 

315073 AW452948 Hs257631 ESTs 5.3 

315084 AI821085 ESTs 8.2 

315214 AI915927 Hs.34771 ESTs 54 

315220 AI420753 Hs.66731 ESTs 5.1 

40 315278 AI985544 Hs.12450 ESTs 5.8 

315282 A1222165 Hs.144923 ESTs 45 

315368 AW291563 Hs.104696 ESTs 6 

315369 AA764916 Hs.256531 ESTs 4.8 
315378 A1263393 Hs.145008 ESTs 62 

45 315379 AI378329 Hs.126629 ESTs 54 

315402 AW293424 Hs.75354 ESTs 5.1 

315442 AA977935 Hs.127274 ESTs 6.6 

315443 AW003416 Hs.160604 ESTs 5.5 
315528 R37257 Hs.184780 ESTs 8.1 

50 315593 AW198103 Hs.158154 ESTs 9.9 

315634 AA837085 Hs.220585 ESTs 7.8 

315705 AW449285 Hs.313636 ESTs 8.9 

315707 AI418055 Hs.181160 ESTs 5.1 - 

315714 AA744015 Hs.298138 EST cluster (not in UniGene) 6.1 

55 315740 T05558 Hs.156880 EST cluster (not In UniGene) 6.8 

315762 AI391470 Hs.158618 ESTs 5.3 

315769 AA744875 Hs.189413 ESTs 5 

315843 AA679430 Hs.191897 ESTs 5.7 

315990 AI800041 Hs.190555 ESTs 92 

60 316012 AA764950 Hs.1 19898 ESTs 4.3 

316036 AA708016 Hs.190389 ESTs 5.9 

316055 AA693880 Hs.6947 EST cluster (not in UniGene) 6.7 

316074 AW517542 Hs.293273 ESTs 55 

316100 AW203986 Hs2 13003 ESTs 5.1 

65 316169 A1127483 Hs.120451 ESTs 82 

316442 AA760894 Hs.153023 ESTs 17.1 

316491 AA766025 Hs.186854 EST 4.6 

316504 AW135854 Hs.132458 ESTs 4.3 

316667 AW015940 Hs232234 ESTs 7.6 
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316854 AA831215 Hs/159066 ESTs; Weakly similar to predicted using 5.1 

316905 AW138241 HS210846 ESTs 6.4 

317008 AW051597 Hs.143707 ESTs 4.4 

317019 AA864968 Hs.127699 ESTs 11 

5 317194 AW445167 Hs.126036 ESTs 135 

317224 D56760 Hs.93029 ESTs 87 

317404 AI806867 Hs.126594 ESTs 6.7 

317501 AA931245 Hs.137097 ESTs 11.1 

317548 AI654187 Hs.195704 ESTs 142 

10 317651 AW292779 Hs.169799 ESTs 5.8 

317758 AI733277 Hs.128321 ESTs 5.4 

317850 N29974 Hs.152982 EST cluster (not in UniGene) 11.4 

317869 AW295164 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.6 

317902 AI828602 Hs211265 ESTs 5.3 

15 317916 AI565071 Hs.159983 ESTs 7.7 

318239 AJ085198 Hs.164226 ESTs 13.1 

318268 A1817736 H&182490 ESTs 62 

318327 AW294013 Hs200942 ESTs 4.6 

318363 R45530 Hs.1440 gamma-aminobutyric acid (GABA) A recepto 6

20 318428 AI949409 Hs.194591 ESTs 123 

318464 AI151010 Hs.157774 ESTs 4.3 

318524 AW291511 Hs.159066 ESTs 25.9 

318540 T30280 Hs274803 EST cluster (not In UniGene) 7 

318591 AW206806 Hs.1 15325 ESTs 4.8 

25 318615 AI133617 Hs.10177 ESTs 5.5 

318646 AW175665 Hs278695 ESTs 5.7 

318667 AI493742 Hs.165210 ESTs 11 

318668 W26276 Hs.136075 ESTs 5.9 
318753 AA578265 Hs.7130 copinelV 55 

30 319080 Z45131 Hs23Q23 ESTs 16.9 

319181 F06504 Hs27384 EST cluster (not in UniGene) 4.6 

319191 AF071538 Hs.79414 prostate epithelium-specific Ets transcr 6.6 

319233 R21054 Hs.180532 ESTs 4.9 

319586 D78808 Hs283683 ESTs 82 

35 319750 AA621606 Hs.117956 ESTs 9.3 

319763 AA460775 Hs.6295 ESTs 14.3 

319824 AA424266 Hs.123642 EST duster (not to UniGene) 12.8 

319838 AA337642 Hs.95262 nuclear factor related to kappa B bindin 5.1

319913 AA179304 Hs271586 ESTs; Moderately similar to HI! ALU SUB 43 

40 319964 T80579 H&29G270 ESTs 53 

320076 AI653733 Hs271593 ESTs 85 

320102 AW296219 Hs.1 15325 RAB7; member RAS oncogene family-like 1 9.8 

320187 T99949 Hs3Q3428 EST duster (not in UniGene) 93 

320211 AL039402 Hs.125783 DEME-6 protein 7.9 

45 320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562 

320455 R49889 Hs24144 EST duster (not in UniGene) 8.3 

320464 AI089817 H&237146 ESTs 5.4 

320561 NM_006953 Hs.159330 EST duster (not in UniGene) 7 

320574 AL049443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp586N2020 (f 4.4

50 320576 AL049977 Hs.1 62209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7 

320654 AW263086 Hs.1 18112 ESTs 6 

320796 AFQ38966 H&31218 secretory carrier membrane protein 1 135 

320800 AI681006 Hs.71721 ESTs 62 - 

320813 AW360847 Hs.16578 ESTs 9.3 * 

55 320853 AI473796 Hs.135904 ESTs 8.1 

320856 D59945 Hs.65366 EST cluster (not in UniGene) 6

320899 AA633772 Hs.1 16798 ESTs 92 

320918 AW195012 Hs293970 ESTs 5 

320973 H19732 . HS247917 ESTs 5.9 

60 321099 AA018386 Hs.64341 ESTs 4.6 

321190 H52462 Hs,163872 EST duster (not In UniGene) 53 

321318 AB033041 Hs.137507 EST cluster (not in UniGene) 8.4

321382 AW372449 Hs.175982 EST duster (not in UniGene) 73 

321441 AW297633 Hs.118498 ESTs 14.7 

65 321538 H80483 Hs.46903 EST duster (not in UniGene) 92 

321609 H86Q21 Hs.182538 ESTs;WeaWysimnartohMmTRA1b(H.sapl 4.8 

321636 AI791838 Hs.193465 ESTs 55 

321638 AI356352 Hs.108932 ESTs 4.6 

321644 A1204177 Hs237396 ESTs 6.6 
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324676 
324691 
324696 
324713 
324715 
324718 
324720 
324752 
324753 
324790 
324801 
324804 
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326816 
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U31382 
U39840 
AA319514 
AAD37415 
AA056557 
AA102571 
AA121140 
AA167269 
AA252033 
AA281092 
AA449677 
AA450200 
AA479114 
D60374 
AA149579 
H01458 
H20826 
N24619 
R36671 
R51361 
R82331 
T64447 
AA262999 
AA278355 
AA287662 
AA400596 
AA416979 
AA454543 
F10802 
H77381 
N21680 
N27154 
N32912 
N34357 
N62780 
N92352 



331811 AA404500 



ESTs 

Hs.129179 ESTs 
Hs.1 12451 ESTs 

Hs.293341 ESTs;WeaWysimflartoPn>a2(Xl)[H.sa 
Hs.257339 ESTs 
Hs.163440 ESTs 

Hs.131798 EST duster (not in UniGene) 
Hs.1 16467 ESTs 
HS392437 ESTs 

Hs272072 ESTs;ModeratelyslmilartoH!!ALUSUB 
Hs.144871 EST cluster (not in UniGene) 
Hs.159337 ESTs 
Hs.14553 ESTs 
ESTs 

Hs.337533 ESTs 
Hs.136102 KIAA0853 protein 
Hs.125350 ESTs 

EST cluster (not in UniGene) 
ESTs 

CH^0_hsgIj6552458 
CH-21JtsglI5867660 
CHJ21_hs Q96682516 
CH.07Jisgi|5868455 
CHJLhsgi|5868837 
CH.16_p2gi|6165201 
CH.16_p2gi|5091594 
CH.16_p2gi|6671887 
CH.05_p2gi|6013592 
androgen receptor (dihydrotestosterone r 
Hs321110 

Hs.299867 guanine nucleotide binding protein 4 

hepatocyte nuclear factor 3; alpha 
Hs.30732 ESTs 
Hs.20999 ESTs 
Hs.6759 ESTs 
Hs.157078 ESTs 

Hs.177576 ESTs; Moderately similar to kynurenine a 
Hs.52620 ESTs 

Hs£4052 ESTs; Weakly similar to IUI ALU SUBFAMI 
Hs35254 ESTs 

Hs.15251 Human DNA sequence from clone 437M21 on 
Hs.143187 FK506-binding protein 3 (25kD)
Hs.11356 ESTs 
EST 

Hs.91202 ESTs 
Hs.142896 ESTs 
Hs.315181 ESTs 
Hs.108920 ESTs 
Hs.14846 ESTs 
Hs.268714 ESTs 
Hs£68838 ESTs 
Hs.168439 ESTs 
Hs.300141 ESTs 
Hs.87929 ESTs 
Hs.1 18630 ESTs 
Hs.88143 ESTs 
Hs.81897 ESTs 
Hs.43543 ESTs 

Hs.237339 ESTs; Moderately similar to fill ALU SUB 
Hs.41223 ESTs 
Hs.43455 ESTs 
Hs.44076 ESTs 

Hs.291039 ESTs;WeaWysimilartohypoth8ticaJ43. 
Hs.93817 ESTs 
Hs.48703 ESTs 
Hs5472 ESTs 
Hs.334305 ESTs 
Hs.65949 KIAA0888 protein 
Hs.187958 ESTs 
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331848 AA417039 Hs.98268 signal recognition particle 72 kD 75 

331873 AA429445 Hs.98640 ESTs 65 

331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 335 

331967 AA460158 Hs.99589 KIAA1028 protein 6.8 

5 331974 AA464518 Hs.105322 ESTs 55 

332043 AA490831 Hs.201591 ESTs 105 

332076 AA599477 HS-291156 ESTs 4.4 

332173 F09281 Hs.100725 ESTs 55 

332247 N58172 ESTs 14.2 

10 332249 N62096 Hs.194140 ESTs 72 

332325 T79428 Hs.339667 ESTs 5.6 

332396 AA340504 ESTs; Weakly similar to similarto human 21.2 

332434 N75542 Hs.237731 transcription factor 4 155 

332493 N95495 Hs.56729 ESTs; Highly similar to GTP-binding prot 7.1

332522 L38503 Hs.178357 glutathione S-transferase theta 2 6.6

332526 AA281753 Hs.17731 inositol 1 ^triphosphate receptor, ty 5.8 

332530 M31682 Hs.19280 inhibin; beta B (activin AB beta polypep 55

332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1 

332538 N48715 Hs.20991 ESTs 65 

20 332546 D84454 Hs__587 solute carrier family 35 (UDP-galactose 4.8 

332594 AA279313 Hs52951 methyl CpG binding protein 2 55 

332610 AA412405 Hs.40513 ESTs; Weakly similar to BETA GALACTOSIDA 5.6 

332661 N95742 Hs.6390 ESTs 6.9 

332697 T94885 Hs.75725 carboxypeptidase E 245 

25 332712 D26070 Hs.79306 Inositol 1 ^triphosphate receptor; ty 9.9 

332716 L00058 Hs.79630 v-myc avian myelocytomatosis viral oncog 55 

332726 R72029 Hs.83428 synaptophysin-like protein 5 

332781 AA233258 ESTs; Weakly similar to D 10075 [C.etega 45 

332797 CH22.R3ENES.6_2 305 

30 332798 CH22J=GENES5_5 665 

332799 CH22J=GENES5jB 195 

332933 CH22_FGENES.38_7 55 

332980 CH22_FGENES54_1 55 

332984 CH22_FGENES54_6 4.9 

35 333168 CH22_.FGENES.94J 4.7 

333169 CH22_.FGENES.94_2 4.4 

333452 CH22LFGENES.1 57 J 45 

333456 CH22_FGENES.157_5 4.3 

333458 CH22_FGENES.157_7 4.6 

40 333611 CH22_FGENES.217_6 4.7 

333621 CH22_FGENES.219_5 55 

333814 CH22_FGENES.282_2 7.1 

333849 CH22_FGENES.290_8 62 

333949 CH22_FGENES.3G3_5 45 

45 333951 CH22J=GENES.303_7 4.9 

333955 CH22J=GENES.3Q3J1 5.6 

334150 CH22_FGENES.339J 5.1 

334223 CH22 FGENES560_4 20.3 

334297 CH22_FGENES572_3 9.4 

50 334443 CH22_FGENES587_2 4.6 

334444 CH22J=GENES587_4 5.6 

334447 CH22_FGENES.387_7 13.1 

334570 CH22_FGENES.405_11 5.4 - 

334749 CH22_FGENES.427J 5.3 

55 334777 CH22_FGENES.430J9 4.7 

334960 CH2^FGENES.465_29 52 

335179 CH22_FGENES504_9 8.8 

335293 CH22JH_ENES527_6 4.7 

335550 CH22_FGENES576_11 5.1 

60 335581 CH22_FGENES.581_19 5.7 

335586 CH22_FGENES581_25 45 

335809 CH22J=GENES.617_6 6_ 

335810 CH22_FGENES.617_7 55 
335B22 CH22_FGENES.619_7 7.1 

65 335824 CH22.FGENES.619J1 85 

335653 CH22_FGENES526_5 4.3 

335886 CH22_FGENES.632_4 45 

336034 CH22.FGENES578_5 65 

336441 CH22.FGENES.827.7 7.6 
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336624 


CH22JGENES.6-3 


43.3 


336625 CH22_FGENES.6-4 37.9
336679 CH22_FGENES.43-7 5.3
337577 CH22_C65E1.GENSCAN.8-1 4.9
338255 CH22_EM:AC005500.GENSCAN.276-3 13.4
338260 CH22_EM:AC005500.GENSCAN^79-10 4.6
338561 CH22_EM:AC005500.GENSCAN.421-5 4.6
338562 CH22_EM:AC005500.GENSCAN.421-6 4.3
338759 CH22_EM:AC005500.GENSCAN.517-6 5.1
338763 CH22_EM:AC005500.GENSCAN.517-16 5.5
338764 CH22_EM:AC005500.GENSCAN.517-17 7.1






TABLE 3A shows the accession numbers for those primekeys lacking unigene ID's for Table
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
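
The three-column layout described above can be read programmatically. The following is a minimal sketch only, assuming each row is whitespace-delimited with the Pkey first, the CAT number second, and all remaining fields being Genbank accession numbers. The function name `parse_cluster_row` and the example row are hypothetical and do not come from the table.

```python
from typing import NamedTuple


class ClusterRow(NamedTuple):
    pkey: int              # Eos probeset identifier number
    cat_number: str        # gene cluster number, kept as text
    accessions: list[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-delimited row into Pkey, CAT number, accessions."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least a Pkey and a CAT number: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Hypothetical row for illustration; the values are not taken from Table 3A.
row = parse_cluster_row("999999 12345_1 AA000001 AA000002")
print(row.pkey, row.cat_number, len(row.accessions))  # prints: 999999 12345_1 2
```

Keeping the CAT number as text rather than an integer avoids losing any suffix that a cluster identifier may carry.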

Pkey CAT number Accession

123619 371681J AA602964 AA609200
116722 143512J Z24878 AM94098 F13654 AA494040 AA143127
103677 41847J Z83806 AJ132091 AJ132090
125992 1589048 J H48372 W01626
109342 genbank_AA213620 AA213620
125154 genbank_W38419 W38419
101447 entrez_M21305 M21305
124357 genbank_N22401 N22401
108910 genbank_AA136590 AA136590
322278 47271 1 W69304 AF086283 W69200
315084 350959J AI821085 AW973464 AA554802 AI821831 AA657438 AA640756 AA650339
324019 262792J AW177009 AI381610
324330 300543 J AA884766 AW974271 AA592975 AA447312
324626 33641 1_1 AI685464 AW971336 AA513587 AA525142
303029 37699J AF199613 AF108756
324804 398093J AI692552 AI393343 AI800510 AI377711 F24263 AA661876
324961 376239J AA613792 AW182329 T05304 AW858385

329362 c_Os 
35 336624 CH22J071FG__6_3_ 

336625 CH22_4072FG__6_4_ 

336679 CH22_4157FG_43J7_ 

338255 CH22_6856FG_LINK_EM^C00 

338260 CH22_6863FG_UNieEM:AC00 
40 329929 c16_p2 

329960 c16_p2 

338561 CH22 7294FG_UN1CEM^C00 

338562 CH22l7295FG__LINK_EM:AC00 
338759 CH2£_7581FG_LINK_EM:AC00 

45 338763 CH22_7585FG_LINK_EM:AC00 
338764 Cr^2J586FQ_UNK.EMAC00 

333168 CH22J00FG_94J.UNK_EMA 

333169 CH22_40ire_94J_L1NieEM:A 
333452 CH22J02FGJ57J_LINK_EM: 

50 333456 CH22J06FG_157_5_UNK_EM: 
r 333458 CH2^708FGL157_7_UNK_EM: 

333611 CH22_872FGJ17_6JJNK-EM: 

333621 CH22_882FG_219_5_UNK_EM: 

333814 CH22 10*3FGL282_2JJNK_EM 
55 ' 333849 Cr^1118FG^_8JJNK_EM 

335179 CH22_2515FG_504_9JJNK_EM 

333949 CH22_1225FG,303_5_LINK_EM 

333951 CH22_1227FG_303J7J_INK_EM 

333955 CH22J231FG_303J1^UNK_E 
60 335293 CH22J2635FG_527_6_UNK_EM 

326816 C20JS 

326997 c21JS 

335550 CH22J905FG__576J1JJNK_E 
335581 CH22J2938FGJ81J9JJNK_E 
65 335586 CH22J944FG_581_25_LINK_E 
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328492 c_7_hs 

335809 CH22_3181FG_617_6_.UNK.EM 

335810 CH22_3182FG 617 7JUNK.EM 
335822 CH22.31 95FG_61 9_7_LIN K_EM 

5 335824 CH22_3197FG_619_11JJNK_E 
335853 CH22_3228F<L626_5_UNK_EM 
335886 CH22JE61F<L632jU-INK_EM 
330020 c16_p2 
330211 c_5 _p2 
10 337577 CH22_5864FG__LINK_C65E1.G 
307848 AI364186 

332797 CH22J3FGL6_2^UNieC4G1.G 

332798 CH22J4FG.6 5_UNK_C4G1.G 

332799 CH22_15FG_6_6_UNK_C4G1.G 
15 334150 CH22J429FG_339J_L1NK__EM 

332933 CH22J54FG_38_7_UNK_C20H 
332980 CH22J04FGJ54J_UNK_EM:A 
332984 CH22_208FG__54_6_UNK_EM:A 
334223 CH22_1507FG_360_4_LINK_EM 
20 334297 CH22_1588FG_372_3_LINK_EM 
327098 C21JIS 

334443 CH22J742FG_387_2J.fNK_EM 

334444 CH22_1743FG_387_4J-INK^EM 
334447 CH22_1746FG_387_7_UNK_EM 

25 334570 CH22 J875FG_405 J 1JJNK_E 
334749 CH22_2061 FG_427_1_LINK^EM 
334777 CH22_2089FG_430_9_LINK_EM 
336034 CH22_341 9FG_678_5_UNKJ)J 
334980 CH22_2281FG_4S5_29JJNK_E 
30 336441 CH22_3861FG_827_7JJNK_DJ 

330551 9851_2 U39840 NM_004496 AW1 35607 BE087458 BE087567 AA177116 AW195705 AW750756 AI811008 AI694151 

BE348594AW971075 A1347950 AI201455 AI073898 M652680AA613671 A1318364AA507550AA693692 
AI032599 AA991871 AI269801 AW948974 T74639 AA532907 AW949173 
330786 53973_3 BE379594 AI192455 AL039862 AI744012 AI761735 AW243181 AI743687 A1928223 AI423022 A1627855 

35 A1636059 A1651571 AW802044 AI826995 AI431733 AI539125 AA863056 AW270910 AI768930 AW008835 

AW615183 AW591147 A1695294 AI672106 AA506358 AI308060 AA011556 AA962437 AI935488 BE219625 
AI004356 AW151394 AI218466 N66178 AI419784 AW242519 AW946907 D60374 AA989263 AI698799 
AM70460A1824167 

332247 372969 1 AA669097AA513815 AAQ26798AA676526AA704429M704269AW1 18292 AA579216 N58172 

40 332396 20265 1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798 

R17370AI908947 AA382932 R58449 H18732AA371231 AW962899 AA713530 AW892946 R53463H11063 
AW068542 Z40761 BE176212 8E176155 W23952 W92188 AW374883 AA303497 AW954769 AA036808 
BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983 A1805213 AI761264 W94885 
N94502 A1623772 AI419532 AI810302 AI634190 AW002516 AW150777 AI352312 AI367474 AW204807 
45 AI675502 A1337026 AW134715 BE328451 AI123157 AI560020 AI300745 AI608631 Ai248873 AA742484 

AW051635 H18646 AI245045 AA5071 1 1 AI64051 0 AI925594 AA1 15747 AA143035 AA151 106 
332781 32044 J AK001 764 BE313896 AA380199 AA380151 AA194996AW1 18089 AA495871 AW975219AW085598 

AI378909 AW992310 AW992409 AI911857 AA657643 A1804471 A1242589 A1623968 R09556 A1129100 
AI206500 AA680094 AA677784 AI023178 AI277519 AA424742 AI240654 AA232846 AI804273 AI382376 
50 AA001729 W90790 BE090656 AW295015 AI674596 A1431734 AI420517 AW769185 AI128355 AI192474 

AI820001 AA001929 AA706925 AKJ76676 AI4991 19 AI200493 AI695919 AI376217 W69195 W69261 
AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 AI674387 A1872616 
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TABLE 3B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence 
source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham, I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
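
As an illustration of how the Strand and nucleotide position entries may be read together, the following minimal sketch normalizes one entry and reports the exon length. It assumes that positions are given as two integers joined by a hyphen, that minus-strand entries may list the higher coordinate first, and that both end coordinates are inclusive. These are assumptions about the layout rather than statements from the table, and the function name `exon_span` is hypothetical.

```python
def exon_span(strand: str, position: str) -> tuple[int, int, int]:
    """Return (low, high, length) for one predicted exon.

    Assumes inclusive coordinates, so length = high - low + 1.
    """
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    first, second = (int(part) for part in position.split("-"))
    low, high = min(first, second), max(first, second)
    return low, high, high - low + 1


print(exon_span("Plus", "6548368-6548507"))   # (6548368, 6548507, 140)
print(exon_span("Minus", "216964-216798"))    # (216798, 216964, 167)
```

Sorting the two coordinates before subtracting keeps the length positive regardless of the order in which a strand lists them.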



Pkey Ref 

333611 Dunham, I. etal. 

333621 Dunham, I. etal. 

333814 Dunham, I. etal. 

333849 Dunham, I. etal 

333949 Dunham, I. etal 

333951 Dunham, I. etal. 

333955 Dunham, I. etaL 

334150 Dunham, I. etal. 

334297 Dunham, I. etal. 

334443 Dunham, I. etal. 

334444 Dunham, I. etal. 
334447 Dunham, I. etal 
334570 Dunham, I. etal 
334777 Dunham, I. etal. 
335179 Dunham, I. etal. 
335581 Dunham, I. etal. 
335586 Dunham, I. etal. 

335809 Dunham, I. etal 

335810 Dunham, I. etal. 
335822 Dunham, I. etal. 
335824 Dunham, I. etal 
335886 Dunham, I. etal. 
336034 Dunham, I. etal 
336441 Dunham, I etal 
337577 Dunham, I. etal 
338260 Dunham, I etal 

332797 Dunham, I. etal 

332798 Dunham, I. etal 

332799 Dunham, !. etal. 
332933 Dunham, I. etal. 
332980 Dunham, I. etal. 
332984 Dunham, I. etal. 

333168 Dunham, I etal 

333169 Dunham, I. etal. 
333452 Dunham, I. etal. 
333456 Dunham, I. etal. 
333458 Dunham, I. etal 
334223 Dunham, I. etal. 
334749 Dunham, I. etal 
334960 Dunham, I. etal. 
335293 Dunham, I. etal. 
335550 Dunham, I etal. 
335853 Dunham, I. etal 

336624 Dunham, I. etal. 

336625 Dunham, I. etal. 
336679 Dunham, I. etal 
338255 Dunham, I. etal 

338561 Dunham, I. etal. 

338562 Dunham, I. etal 
338759 Dunham, I. etal 

338763 Dunham, I. etal. 

338764 Dunham, I. etal. 



Strand NLposition 

Plus 6548368-6548507 

Plus 8597414-8597560 

Pius 7894165-7894252 

PIUS 8018323-8018472 

Plus 8589634-8589791 

Plus 8592501-8592637 

Plus B597414-8597560 

Plus 1052922M0529854 

Plus 13420934-13421058 

PIUS 14298981-14299056 

Pius 14306433-14306492 

Plus 14308764-14308824 

Plus 14994868-14994943 

Plus 16259586-16260166 

Plus 21634405-21634526 

Plus 24976198-24976334 

Plus 24990333-24990497 

Plus 26310772-26310909 

PIUS 26314767-26314849 

Plus 26364087-26364196 

Pius 26376860-26376942 

Plus 26934235-26934364 

Plus 29014404-29014590 

Plus 34187606-34187663 

Plus 595377-595678 

PIUS 15458919-15459257 

Minus 216964-216798 

Minus 232147-231974 

Minus 232421-232307 

Minus 2035790-2035681 

Minus 5136165-5136019 

Minus 2632606-2632457 

Minus 3729896-3729788 

Minus 3730864-3730767

Minus 5136165-5136019

Minus 2631933-2631797 

Minus 5143942-5143806 

Minus 12734365-12734269 

Minus 16090686-16090106 

Minus 20160968-20160795 

Minus 22316408-22316275 

Minus 24668714-24668658 

Minus 26614629-26614506 

Minus 227714-227577 

Minus 229124-229024 

Minus 2035790-2035681 

Minus 15242294-15242231 

Minus 22311966-22311856 

Minus 22312594-22312465 

Minus 26582475-26582199 

Minus 26628148-26628009 

Minus 26641232-26641101 
329960 5091594

329929 6165201 

330020 6671887 

326816 6552458 

326997 5867660 

327098 6682516 

330211 6013592 

328492 5868455 

329362 5866837 



Minus 103M162 

Minus 156410-156553 

Plus 172397-172491 

Plus 198354-198436 

Minus 71389-72147 

Minus 1061684-1062361 

Plus 59158-59215

Minus 46094-46241 

Minus 65688-68173 
TABLE 4 shows a preferred subset of the Accession numbers for genes found in Table 3 
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 



Pkey ExAccn UnigeneID Unigene Title R1

100819 HG4020-HT4290 Hs.2387 Transglutaminase 105
102698 U75272 Hs.1867 progastricsin (pepsinogen C) 10.6
102869 XQ2544 Hs.572 orosomucoid 1 22.6
105370 AA236476 Hs.22791 ESTs; Weakly similar to transmembrane pr 10.3
105645 AA282138 Hs.11325 ESTs 14
106094 AM19461 Hs.23317 ESTs 103
109014 AA156790 Hs.262036 ESTs 15.3
109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 103
113021 T23855 Hs.129836 KIAA1028 protein 103
114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213
122791 AA460158 Hs.129836 KIAA1028 protein 12.4
124352 N21626 Hs.102406 ESTs 102
301042 AI659131 Hs.197733 ESTs 243
302005 AI869666 Hs.123119 ESTs 363
302410 NM_004917 Hs.218366 EST cluster (not in UniGene) with exon h 263
302881 AA508353 Hs.105314 relaxin 1 (H1) 78.8
303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjug 193
303753 AW503733 Hs.3414 ESTs 13
310431 AI420227 Hs.149358 ESTs 723
311251 AI655662 Hs.197698 ESTs 413
311596 AI682088 Hs.79375 ESTs 264
312153 AA759250 Hs.118625 cytochrome b-561 11
312521 AA033609 Hs.239884 ESTs 112
313676 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4
314171 AI821895 Hs.193481 ESTs 29.4
314907 AI672225 Hs.222886 ESTs 19.3
315051 AW292425 Hs.163484 EST 153
315052 AA876910 Hs.134427 ESTs 20
317548 AI654167 Hs.195704 ESTs 142
317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.8
318428 AI949409 Hs.194591 ESTs 123
318524 AW291511 Hs.159066 ESTs 253
319080 Z45131 Hs.23023 ESTs 163
319763 AA460775 Hsj6295 ESTs 143
320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562
321441 AW297633 Hs.118498 ESTs 14.7
322303 W07459 Hs.157601 EST cluster (not in UniGene) 22
322782 AA056060 Hs.202577 EST cluster (not in UniGene) 18.4
322818 AW043782 Hs.293616 ESTs 10.7
323287 AA639902 Hs.104215 ESTs 24.7
324603 AW016378 Hs.292934 ESTs 242
324617 AA508552 Hs.195839 ESTs 54
324658 AI694767 Hs.129179 ESTs 22
324691 AI217963 Hs.293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa 103
324696 AA641092 Hs.257339 ESTs 102
324718 AI557019 Hs.116467 ESTs 34.4
330211 CH.05_p2 gi|6013592 12.6
330430 HG2261-HT2352 Hs.321110 Antigen, Prostate Specific, Alt. Splice 133
330706 AA121140 Hs.177576 ESTs; Moderately similar to kynurenine a 145
330762 AA449677 Hs.15251 Human DNA sequence from clone 437M21 on 18.5
330892 AA149579 Hs.91202 ESTs 153
330949 H01458 Hs.142896 ESTs 103
331099 R36671 Hs.14846 ESTs 11.6 

331151 R82331 Hs.268838 ESTs 13 

331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6 

332247 N58172 ESTs 14.2 

332396 AA340504 ESTs; Weakly similar to similar to human 21 2 

332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1 

332697 TS4885 Hs.75725 carboxypeptidase E 24.3 

332797 CH22_FGENES.6_2 30.8 

332798 CH22_FGENES.6_5 66.8 

332799 CH22_FGENES.6_6 19.8 

334223 CH22_FGENES.360_4 20.3 

336624 CH22_FGENES.6_3 43.3 

336625 CH22_FGENES.6_4 37.9 
TABLE 4A shows the accession numbers for those primekeys lacking UnigeneID's for Table 
4. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 

Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 

Pkey CAT number 






336624 CH22_4071FG_6_3_ 

336625 CH22_4072FG_6_4_ 
330211 C_5ji2 

332797 CH22_13FG_6_2_LINK_C4G1.G 

332798 CH22_14FG_6_5_LINK_C4G1.G 

332799 CH22_15FG_6_6_LINK_C4G1.G 
334223 CH22_1507FG_360_4_LINK_EM 
332247 372969J 

332396 2Q265J 



Accession 



AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172 
AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367611 
AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 
R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497 
AW954769 AA036808 BE168063 AW382073 AW382085 AL041476 H80748 AI078161 BE463983 
AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 
AI352312 AI367474 AW204807 AI6755Q2 AI337026 AW134715 BE328451 AI123157 AJ560020 
AI300745 AI608631 AI248873 AA742484 AW051635 H18646 AI245045 AA507111 AI640510 AI925594 
AA115747 AA143035 AA151106 
TABLE 4B shows the genomic positioning for those primekeys lacking Unigene ID's and 
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence 
source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham, I. et al." refers to the publication entitled "The 
DNA sequence of human chromosome 22." Dunham, I. et al., Nature (1999) 402:489-495. 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
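As an illustration only, the four columns defined above can be read into a small record. The sketch below is not part of the original disclosure: the names `PredictedExon` and `parse_exon` are hypothetical, and it assumes that the two nucleotide positions are inclusive coordinates. In the rows of this table, exons predicted on the Minus strand are printed with the larger coordinate first, so the sketch normalizes the pair before computing a length.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PredictedExon:
    """One row of a genomic-positioning table (hypothetical helper type)."""
    pkey: int     # Eos probeset identifier
    ref: str      # sequence source: a GI number or "Dunham, I. et al."
    strand: str   # "Plus" or "Minus", as printed in the table
    first: int    # first coordinate as printed
    last: int     # second coordinate as printed

    @property
    def span(self) -> Tuple[int, int]:
        # Order the coordinates low-to-high regardless of strand.
        return (min(self.first, self.last), max(self.first, self.last))

    @property
    def length(self) -> int:
        # Assumes inclusive coordinates; the text does not state this.
        low, high = self.span
        return high - low + 1


def parse_exon(pkey: str, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Build a PredictedExon from the four column values of one row."""
    strand = strand.strip().capitalize()
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand label: {strand!r}")
    first_text, last_text = nt_position.strip().split("-")
    return PredictedExon(int(pkey), ref.strip(), strand,
                         int(first_text), int(last_text))


exon = parse_exon("332797", "Dunham, I. et al.", "Minus", "216964-216798")
print(exon.span, exon.length)
```

Keeping the coordinates as printed and deriving the ordered span separately leaves the strand convention of the table visible in the record.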



Pkey Ref Strand Nt_position

332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
334223 Dunham, I. et al. Minus 12734365-12734269
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
330211 6013592 Plus 59158-59215
TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO 
NORMAL ADULT TISSUES 

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues. 
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such 
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than 
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73 
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile 
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of 
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues 
was subtracted from both the numerator and the denominator before the ratio was evaluated. 
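The selection rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the percentile is computed by linear interpolation (the text does not say which percentile convention was applied), and returning `None` when the background-corrected denominator is not positive is a choice made here, since the text does not address that case.

```python
from typing import Optional, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile by linear interpolation between order statistics.

    The interpolation rule is an assumption; the source does not specify one.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (pct / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def tumor_to_normal_ratio(tumor: Sequence[float],
                          normal: Sequence[float],
                          level_pct: float = 85.0,
                          background_pct: float = 7.5) -> Optional[float]:
    """Ratio of "average" tumor to "average" normal level for one probeset.

    The background (a low percentile of the normal tissues) is subtracted
    from both numerator and denominator before the ratio is taken.
    """
    background = percentile(normal, background_pct)
    numerator = percentile(tumor, level_pct) - background
    denominator = percentile(normal, level_pct) - background
    if denominator <= 0:
        return None  # assumption: ratio treated as undefined in this case
    return numerator / denominator


def is_selected(ratio: Optional[float], cutoff: float = 3.44) -> bool:
    """Apply the stated cutoff (ratio greater than or equal to 3.44)."""
    return ratio is not None and ratio >= cutoff


# Made-up expression values, for illustration only.
normal_levels = [float(i) for i in range(201)]
tumor_levels = [700.0] * 10
ratio = tumor_to_normal_ratio(tumor_levels, normal_levels)
print(ratio, is_selected(ratio))
```

Subtracting the same background from both terms means a probeset with a high non-specific signal in every tissue does not appear up-regulated merely because its raw values are large.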



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal tissue 

Pkey ExAccn UnigeneID Unigene Title R1


446057 AI420227 Hs.149358 ESTs, Weakly similar to A46010 X-linked 86.42
400302 N48056 Hs.1915 folate hydrolase (prostate-specific memb 66.46
414569 AF109298 Hs.118258 prostate cancer associated protein 1 58.36
417407 AA923278 ESTs, Weakly similar to nntteaw fH.sanl 56.16
431579 AW971082 Hs.222886 ESTs, Weakly similar to TRHY_HUMAN TRICH 53.38
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo 48.28
409731 AA125985 Hs.56145 thymosin, beta, identified in neuroblast 45.24
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 43.48
420154 AI093155 Hs.95420 JM27 protein 41.12
433466 AA508353 Hs.105314 relaxin 1 (H1) 39.88
400296 AA305627 Hs.139336 ATP-binding cassette, sub-family C (CFTR 38.42
400292 AA250737 Hs.72472 ESTs 38.00
432887 AI926047 Hs.162859 ESTs 36.48
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr 36.45
430722 AW968543 Hs.203270 ESTs, Weakly similar to ALU1_HUMAN ALU S 33.20
437052 AA861697 Hs.120591 ESTs 33.02
418396 AI765805 Hs.56691 ESTs 32.68
434036 AI659131 Hs.197733 hypothetical protein MGC2849 32.44
407709 AA456135 Hs.23023 ESTs 32.10
426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen 31.60
407168 R45175 ESTs 31.72
440260 AI972867 Hs.7130 copine IV 30.52
421513 X00949 Hs.105314 relaxin 1 (H1) 30.10
416370 N90470 Hs.203697 ESTs, Weakly similar to I38022 hypoth 29.68
407122 H20276 Hs.31742 ESTs 29.54
400287 S39329 Hs.181350 kallikrein 2, prostatic 28.90
432244 AI669973 Hs.200574 ESTs 28.74
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 28.74
415989 AI267700 Hs.111126 ESTs 28.34
418961 AW967646 Hs.23023 ESTs 27.34
425628 NM_004476 Hs.1915 folate hydrolase (prostate-specific memb 27.32
458509 AA654650 Hs.282906 ESTs 27.54
448290 AK002107 Hs.20843 Homo sapiens cDNA FLJ11245 fis, clone PL 27.16
428336 AA503115 Hs.183752 microseminoprotein, beta- 26.17
450096 AI682088 Hs.223368 holocarboxylase synthetase (biotin-(prop 25.60
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen 24.91
437571 AA760894 Hs.153023 ESTs 24.74
453160 AI263307 Hs.146228 H2B histone family, member L 24.66
453096 AW294631 Hs.11325 ESTs 24.46
425075 AA506324 Hs.1852 acid phosphatase, prostate 24.53
407202 N58172 Hs.109370 ESTs 24.18
424846 AU077324 Hs.1832 neuropeptide Y 23.57
453370 AI470523 Hs.182356 ATP-binding cassette, sub-family C (CFTR 23.16
422805 AM36989 Hs.121017 H2A histone family, member A 22.52
444917 R68651 Hs.144997 ESTs 22.26
408826 AF216077 Hs.48376 Homo sapiens clone H6-2 mRNA sequence 22.02
413597 AW302885 Hs.117183 ESTs 21.76
426429 X73114 Hs.169849 myosin-binding protein C, slow-type 21.32
435981 H74319 Hs.188620 ESTs 21.12
432966 AA650114 ESTs 21.07
418848 AI820961 Hs.193465 ESTs 21.06
405685 20.50
443271 BE568568 Hs.195704 ESTs 19.58
418819 AA228776 Hs.191721 ESTs 19.54
420757 X78592 Hs.59915 androgen receptor (dihydrotestosterone r 19.72
418994 AA296520 Hs.59546 selectin E (endothelial adhesion molecul 19.56
429918 AW873986 Hs.119383 ESTs 19.04
415539 AI733881 Hs.72472 ESTs 18.43
450382 AA397658 Hs.60257 Homo sapiens cDNA FLJ13598 fis, clone PL 18.34
418829 AA516531 Hs.55999 NK homeobox (Drosophila), family 3, A 18.28
429984 AL050102 Hs.227209 hypothetical protein FLJ21617 17.52
443822 AI087412 Hs.143611 ESTs, Weakly similar to 2004399A chromos 17.66
431676 AI685464 Hs.292638 gb1t88f04j(1 NCI_CGAP_Pr28 Homo sapiens 17.64
410330 AW023630 Hs.46786 ESTs 17.52
432441 AW292425 Hs.163484 ESTs 17.41
452792 AB037765 Hs.30652 KIAA1344 protein 17.59
445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 17.00
414565 AA502972 Hs.183390 hypothetical protein FLJ13590 16.52
430487 D87742 Hs.241552 KIAA0268 protein 16.72
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain 16.50
419536 AA603305 gb:np12d1U1 NCI_CGAP_Pr3 Homo sapiens 16.50
439677 R82331 Hs.164599 ESTs 16.46
449625 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 16.52
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine 16.28
447033 AI357412 Hs.157601 ESTs 16.02
453006 AI362575 Hs.167133 ESTs 15.74
431474 AL133990 Hs.190642 ESTs 15.70
420218 AW958037 Hs.22437 ribosomal protein L4 15.64
408000 L11690 Hs.620 bullous pemphigoid antigen 1 (230/240kD) 15.54
416208 AW291168 Hs.41295 ESTs, Weakly similar to MUC2_HUMAN MUCIN 15.48
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 15.40
415263 AA948033 Hs.130853 ESTs 15.38
432437 W07088 Hs.293685 ESTs 15.26
428398 AI249368 Hs.98558 ESTs 15.21
429900 AA460421 Hs.50875 ESTs 14.50
449156 AF103907 Hs.171353 prostate cancer antigen 3 14.59
411096 U80034 Hs.68583 mitochondrial intermediate peptidase 14.51
435974 U29690 Hs.57744 Homo sapiens beta-1 adrenergic receptor 14.76
444484 AK002126 Hs.11260 hypothetical protein FLJ11264 14.76
422728 AW937826 Hs.103262 ESTs, Weakly similar to ZN91_HUMAN ZINC 14.60
418601 AA279490 Hs.86368 calmegin 14.56
448999 AF179274 Hs.22791 transmembrane protein with EGF-like and 14.55
445885 AI734009 Hs.127699 KIAA1603 protein 14.44
452712 AW838616 gb:RC5-LT0054-14020OO13-D01 LT0054Homo* 14.22
432189 AA527941 gb:nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens 14.12
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78
429290 AF203032 Hs.198760 neurofilament heavy polypeptide (200kD) 13.57
419264 AA877104 Hs.293672 ESTs, Weakly similar to ALUB_HUMAN !!!! 13.40
416445 AL043004 Hs.300678 KIAA0135 protein 13.32
407275 AI364186 gb^w34h07j(1 NCLCGAPJJ14 Homo sapiens 13.24
408369 R38438 Hs.182575 solute carrier family 15 (H+/peptide tra 13.21
446720 AI439136 Hs.140546 ESTs 13.06
434988 AI418055 Hs.161160 ESTs 13.02
448172 N75276 Hs.135904 ESTs 12.58
416182 NM_004354 Hs.79069 cyclin G2 12.94
420544 AA677577 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 12.79
445413 AA151342 Hs.12677 CGI-147 protein 12.64
452588 AA889120 Hs.110637 homeo box A10 12.62
407819 R42185 Hs.274803 ESTs 12.50
433444 AW975324 Hs.129816 ESTs 12.50
421059 AI654133 Hs.30212 thyroid receptor interacting protein 15 12.30
420077 AW512260 Hs.87767 ESTs 12.24
453930 AM19466 Hs.36727 hypothetical protein FLJ10903 12.22
441610 AW576148 Hs.148376 ESTs 12.20
451009 AA013140 Hs.115707 ESTs 12.18
433764 AW753676 Hs.39982 ESTs 12.16
440266 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04
443912 R37257 Hs.184780 ESTs 11.92
419526 AI821895 Hs.193481 ESTs 11.91
423073 BE252922 Hs.123119 MAD (mothers against decapentaplegic, Dr 11.87
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 11.86
414422 AA147224 Hs.71814 ESTs 11.76
450203 AF097994 Hs.301526 L-kynurenine/alpha-aminoadipate aminotra 11.68
436679 AI127483 Hs.120451 ESTs, Weakly similar to unnamed protein 11.60
440901 AA909358 Hs.128612 ESTs 11.60
448045 AJ297436 H$2om prostate stem cell antigen 11.51
433887 AW204232 Hs.279522 ESTs 11.50
434980 AW770553 Hs.293640 sterol O-acyltransferase (acyl-Coenzyme 11.38
425905 AB032959 Hs.161700 novel C3HC4 type Zinc finger (ring finge 11.33
434680 T11738 Hs.127574 ESTs 11.32
449650 AF055575 Hs^97647 calcium channel voltage-dependent L ty 11.18
431173 AW971198 Hs.294063 ESTs 11.16
434539 AW748078 Hs.214410 ESTs, Weakly similar to MUC2_HUMAN MUCIN 11.16
410037 AB020725 Hs.58009 KIAA0918 protein 11.14
417708 N74392 Hs.50495 ESTs 11.14
458332 AI000341 Hs.220491 ESTs 11.12
420381 D50640 Hs.301782 phosphodiesterase 3B, cGMP-inhibited 11.10
425665 AK001050 Hs.159066 hypothetical protein FLJ10188 11.08
425710 AF030880 Hs.159275 solute carrier family, member 4 11.08
428728 NM_016625 Hs.191381 hypothetical protein 11.04
407021 U52077 gb:Human mariner1 transposase gene, comp 11.02
410733 D84284 Hs.66052 CD38 antigen (p45) 11.02
401714 10.90
434485 AI623511 Hs.118567 ESTs 10.89
415786 AW419196 Hs^57924 hypothetical protein FLJ13782 10.87
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma 10.85
453628 AW243307 Hs.170187 hypothetical protein 10.72
408063 BE086548 Hs.42346 calcineurin-binding protein calsarcin-1 10.67
417687 AI828596 Hs.250691 ESTs 10X4
434666 AF151103 Hs.112259 T cell receptor gamma locus 10.53
432374 W68815 Hs.301885 Homo sapiens cDNA FLJ11346 fis, clone PL 10.50
428819 AL135623 Hs.193914 KIAA0575 gene product 10.48
413409 AI638418 Hs.21745 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 10.44
428775 AA434579 Hs.143691 ESTs 10.21
436556 AI364997 Hs.7572 ESTs 10.20
441690 R81733 Hs.33106 ESTs 10.14
419852 AW503756 Hs.286184 hypothetical protein dJ551 D25 10.10
421991 NM_014918 Hs.110488 KIAA0990 protein 10.04
423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02
452039 AI922988 Hs.172510 ESTs 10.00
433043 W57554 Hs.125019 ESTs 9.98
433927 AI557019 Hs.116467 small nuclear protein PRAC 9.97
445424 AB028945 Hs.12696 cortactin SH3 domain-binding protein 9.96
432240 AI694767 Hs.129179 Homo sapiens cDNA FLJ13581 fis, clone PL 9.88
433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84
452744 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 9.82
431217 NM_013427 Hs.250830 Rho GTPase activating protein 6 9.75
427398 AW390020 Hs.20415 chromosome 21 open reading frame 11 9.70
446896 T15767 Hs.22452 Homo sapiens mRNA for KIAA1737 protein, 9.70
421470 R27496 Hs.1378 annexin A3 9.64
406554 9.60
401424 9.58
407902 AL117474 Hs>H181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 9.56
423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 9.54
439024 R96696 Hs.35598 ESTs 9.51
431548 AI834273 HSS711 novel protein 9.48
409262 AK000631 Hs.52256 hypothetical protein FLJ20624 9.45
446271 D82484 Hs.100469 ESTs 9.42
448692 AW013907 Hs.224276 methylcrotonoyl-Coenzyme A carboxylase 2 9.26
437718 
439820 
447342 
446223 
410001 
424012 
441791 
448206 
414269 
442081 
420092 
411630 
421863 
454141 
418278 
428330 
432415 
424906 
415245 
442409 
404571 
418033 
456497 
405876 
448807 
445372 
425171 
419968 
407385 
433172 
422631 
412719 
418849 
444922 
427674 
432101 
416268 
404915 
440106 
442861 
452259 
443250 
437267 
452891 
422219 
453049 
439731 
408554 
421154 
430107 
433404 
450813 
416239 
448212 
449532 
413930 
458191 
444858 
457498 
407235 
433759 
433805 



AA281279 Hs.23317 hypothetical protein RJ14681 

AF274571 Hs.129142 deoxyribonuclease II beta 

AW582962 Hs.300961 CGI-47 protein 

AA761526 Hs.163853 ESTs 

AW188551 Hs.99519 hypothetical protein FU14007 

BE182082 Hs.246973 ESTs 

AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 

AI927288 Hs.196779 ESTs 

AL360204 Hs283853 Homo sapiens rnRNA full length insert cON 

AI199268 Hs.19322 Homo sapiens, Similar to RIKEN CDNA2010 

BE300091 Hs.119699 hypothetical protein FU12969 

AB041036 Hs.57771 kallikrein 1 1 

AW368377 Hs.137569 tumor protein 63 kOa with strong homolog 

AW372449 Hs.175982 hypothetical protein FU21 159 

BE622585 Hs.3731 ESTs, Moderately similar to J38022 hypot 

AA298489 olfactory receptor, family 51 , subfamily 

AA401863 H&22380 ESTs 

AA814043 Hs.88045 ESTs 

U42349 Hs.71 1 1 9 Putative prostate cancer tumor suppresso 

AI952677 Hs.108972 Homo sapiens rnRNA; cDNA DKFZp434P228 (fr 

AW138413 Hs.182356 ATP-blnding cassette, sub-family C (CFTR 

AI088489 Hs.83937 hypothetical protein 

L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, 

T16971 Hs289014 EST s, Weakly similar to A43932 mucin 2 p 

AI566086 Hs.153716 Homo sapiens rnRNA for Hmob33 protein, 3* 

N59650 HS27252 ESTs 

BE208843 Hs.129544 hypothetical protein MGC15438 

W68180 Hs.259855 elongation factor-2 kinase 

AW967956 Hs.123648 ESTs, Weakly similar to AF1 08460 1 ubinu 

A1571940 HsJ549 ESTs 

N36417 Hs.144928 ESTs 

AW732240 Hs.300615 ESTs 

X04430 Hs.93913 interieukin 6 pnterferon, beta 2) 

AA610150 Hs.272072 ESTs, Weakly similar to 138022 hypotheti 

AB037841 Hs.102652 hypothetical protein ASH1 

BE218919 Hs.1 18793 hypothetical protein FU10688 

AW016610 Hs.129911 ESTs 

AW474547 Hs.53565 Homo sapiens PIG-M rnRNA for mannosyrtran 

AI921750 Hs.1 44871 Homo sapiens cONA RJ13752 fis, clone PL 

NM.003528 Hs.2178 H2B hlstone family, member Q 

AI918950 Hs.11092 EphA3 

H51299 gb:yp07cQ6.s1 Soares breast 3NbHBst Homo 

AA864968 Hs.127699 KIAA1603 protein 

AA243837 Hs.57787 ESTs 

AA317439 Hs.28707 signal sequence receptor, gamma (translo 

AI041530 Hs.132107 ESTs 

AW511443 HS258110 ESTs 

N75582 H&212875 ESTs, Weakly similar to DYH9_HUMAN CiU 

AW976073 regulator of mitotic splndie assembly 1 

BE537217 Hs.30343 ESTs 

AI9531 35 Hs.45140 hypothetical protein FU14084 

AA836381 Hs.7323 nuclear receptor oo-repressor/HDAC3 comp 

AA284333 Hs.287631 Homo sapiens cDNA FU14269 fis, clone PL 

AM65293 Hs.105069 ESTs 

T32982 Hs.102720 ESTs 

AI739625 Hs.203376 ESTs 

AL038450 Hs.48948 ESTs 

AI475858 gb:tc87d07.x1 NCLCGAP_CLL1 Homo sapiens 

W74653 Hs.271593 ESTs, Moderately similar to A47582 B-ce! 

M86153 Hs.75618 RAB11A, member RAS oncogene family 

A1420611 Hs.127832 ESTs 

AI199738 Hs.208275 ESTs, Weakly similar to ALUAJWMAN Illl 

A1732230 Hs.191737 ESTs 

D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 

AA680003 Hs.109363 Homo sapiens cDNA: FU23603 fis, clone L 

AA706910 Hs.112742 ESTs 
426485 NM_006207 Hs.170040 platelet-derived growth factor receptor- 7.72
446028 R44714 Hs.106795 Homo sapiens cDNA FLJ13136 fis, clone NT 7.72
418555 AI417215 Hs£7159 hypothetical protein FLJ12577 7.70
447499 AW262580 Hs.147674 protocadherin beta 16 7.70
419839 U24577 Hs.93304 phospholipase A2, group VII (platelet-ac 7.68
416857 AA188775 Hs.292453 ESTs 7.68
413801 M62246 Hs.35406 ESTs, Highly similar to unnamed protein 7.66
425480 AB023198 Hs.158135 KIAA0981 protein 7.66
420120 AL049610 Hs.95243 transcription elongation factor A (SII)- 7.64
424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR 7.64
446307 T50083 Hs.9094 ESTs 7.63
429220 AW207206 Hs.136319 ESTs 7.59
420345 AW295230 Hs.25231 ESTs 7.54
429208 AA447990 Hs.190478 ESTs 7.54
447247 AW369351 Hs.287955 Homo sapiens cDNA FLJ13090 fis, clone NT 7.53
440995 T57773 Hs.10263 ESTs 7.53
448706 AW291095 Hs.21814 interleukin 20 receptor, alpha 7.52
410227 AB009284 Hs.61152 exostoses (multiple)-like 2 7.49
431616 AA508552 Hs.195839 ESTs, Weakly similar to I38022 hypotheti 7.46
434217 AW014795 Hs.23349 ESTs 7.44
431467 N71831 Hs.256398 Homo sapiens mRNA; cDNA DKFZp434E0528 (f 7.42
448519 AW175665 Hs.244334 Homo sapiens prostein mRNA, complete cds 7.42
446791 AI632278 Hs.34981 ESTs 7.40
419743 AW408762 Hs.127478 Homo sapiens clone 24416 mRNA sequence 7.39
445855 BE247129 Hs.145569 ESTs 7.36
425211 M18667 Hs.1867 progastricsin (pepsinogen C) 7.35
419131 AA406293 Hs.301622 ESTs 7.34
400294 N95796 Hs.179809 Homo sapiens prostein mRNA, complete cds 7.33
441736 AW292779 Hs.169799 ESTs 7.28
427701 AM11101 Hs.221750 nuclear autoantigenic sperm protein (his 72A
457733 AW974812 Hs.291971 ESTs 72A
418432 M14156 Hs.85112 insulin-like growth factor 1 (somatomedi 7.22
441201 AW118822 Hs.128757 ESTs 72\
419953 BE267154 Hs.125752 ESTs 7.50
419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 7.20
425018 BE245277 Hs.154196 E4F transcription factor 1 7.20
424560 M158727 Hs.150555 protein predicted by clone 23733 7.18
435380 AA679001 Hs.192221 ESTs 7.14
420658 AW965215 Hs.130707 ESTs 7.12
408291 AB023191 Hs.44131 KIAA0974 protein 7.10
409110 AA191493 Hs.48778 niban protein 7.10
414485 W27026 Hs.182625 VAMP (vesicle-associated membrane protei 7.10
430039 BE253012 Hs.153400 ESTs, Weakly similar to ALU1_HUMAN ALU S 7.10
450832 AW970602 Hs.105421 ESTs 7.10
417153 X57010 Hs.81343 collagen, type II, alpha 1 (primary oste 7.08
412446 AI768015 Hs.92127 ESTs 7.07
412953 Z45794 Hs.238809 ESTs 7.06
418051 AW192535 Hs.19479 ESTs 7.06
421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 7.04
446999 AA151520 Hs.279525 hypothetical protein MGC4485 7.04
440529 AW207640 Hs.16478 Homo sapiens cDNA: FLJ21718 fis, clone C 7.04
441111 AI806867 Hs.126594 ESTs 7.01
451027 AW519204 Hs.40808 ESTs 7.00
408432 AW195262 gb:xn67b05.x1 NCI_CGAP_CML1 Homo sapiens 7.00
432223 AA333283 Hs.285336 Homo sapiens, clone IMAGE3460280, mRNA 7.00
444805 AB007899 Hs.12017 homolog of yeast ubiquitin-protein ligas 6.99
414212 M136569 Hs.295940 KIAA0187 gene product 6.98
431725 X65724 Hs.2839 Norrie disease (pseudoglioma) fi OA
449685 AW296669 Hs.66095 ESTs 6.97
447313 U92981 Hs.18081 Homo sapiens clone DT1P1B6 mRNA, CAG rep 6.96
424590 AW966399 Hs.46821 hypothetical protein FLJ20086 6.94
449655 AI021987 Hs.59970 ESTs 6.92
419563 AA526235 Hs.193162 Homo sapiens cDNA FLJ11983 fis, clone HE 6.90
434163 AW974720 Hs.25206 group XII secreted phospholipase A2 6.89
415809 Z32789 Hs.46601 ESTs 6.86
425782 U66468 Hs.159525 cell growth regulatory with EF-hand doma 6.85
417958 AA767382 Hs.193417 ESTs 6.84
427408 AA583206 Hs.2156 RAR-related orphan receptor A 6.79
445873 AA250970 Hs.251946 poly(A)-binding protein, cytoplasmic 1-1 6.74
410718 AI920783 Hs.191435 ESTs 6.74
432363 AA534489 gb:nf76g11.s1 NCI_CGAP_Co3 Homo sapiens 6.74
436521 AW203986 Hs5130Q3 ESTs 6.73
435604 AA625279 Hs.56892 uncharacterized bone marrow protein BM04 6.73
419083 AI479560 Hs.98613 Homo sapiens cDNA FLJ12292 fis, clone MA 6.72
418245 AA088767 Hs.83883 transmembrane, prostate androgen induced 6.70
420714 BE172704 Hs.522746 KIAA1610 protein 6.70
412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 6.67
421896 N62293 Hs.45107 ESTs 6.66
411078 AI222020 Hs.182364 CocoaCrisp 6.66
452465 AA610211 Hs.34244 ESTs 6.66
422763 AA033699 Hs.83938 ESTs, Moderately similar to MAS2_HUMAN M 6.66
444618 AV653785 Hs.300171 ELL-RELATED RNA POLYMERASE II, ELONGATIO 6.64
NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT 5.31
410150 AW382942 Hs.6774 ESTs 5.30
423952 AW877787 Hs.136102 KIAA0853 protein 5.30
452822 X85689 Hs.288617 hypothetical protein FLJ22621 5.30
447752 M73700 Hs.347 lactotransferrin 5.29
441766 R53790 Hs.23294 hypothetical protein FLJ14393 5.29
431359 AW993522 Hs.292934 ESTs 5.27
427212 AW293849 Hs^8279 ESTs, Weakly similar to ALU7_HUMAN ALU S 5.27
449916 T60525 Hs.299221 pyruvate dehydrogenase kinase, isoenzyme 5.27
454014 AW016670 Hs.233275 ESTs 5.27
419714 AA758751 Hs.98216 ESTs 5.26
428845 AL157579 Hs.153610 KIAA0751 gene product 5.26
417333 AL157545 Hs.42179 bromodomain and PHD finger containing, 3 5.24
419986 AI345455 Hs.78915 GA-binding protein transcription factor, 5.24
407182 AA312551 Hs.230157 ESTs 5.22
420111 AA255652 gb3s21h11.r1 NCI_CGAP_GCB1 Homo sapiens 5.22
428058 AI821625 Hs.191602 ESTs 5.22
459551 AI472808 gb^70e07jc1 SoaresjNSFJSJWJ3TJ»AjP_S 5.22
432524 AI458020 Hs.293287 ESTs 5.22
436207 AA334774 Hs.12845 hypothetical protein MGC13159 5.22
410870 U81599 Hs.66731 homeo box B13 5.22
451418 BE387790 Hs.26369 hypothetical protein FLJ20287 5.22
409757 NM_001898 Hs.123114 cystatin SN 5.21
441124 T97717 Hs.119563 ESTs 5.21
428593 AW207440 Hs.185973 degenerative spermatocyte (homolog Droso 5.21
436401 AI087958 Hs.29088 ESTs 5.20
437113 AA744693 gb:ny26c10.s1 NCI_CGAP_GCB1 Homo sapiens 5.20
450947 AI745400 Hs.204662 ESTs 5.20
453279 AW893940 Hs.59698 ESTs 5.20
445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 5.19
448944 AB014605 Hs.22599 atrophin-1 interacting protein 1; activi 5.19
412198 AA937111 Hs.69165 ESTs 5.18
422646 H87863 Hs.151380 ESTs, Weakly similar to T16584 hypotheti 5.18
438986 AF085888 Hs.269307 ESTs 5.18
453954 AW116336 Hs.75251 DEAD/H (Asp-Glu-Ala-Asp/His) box binding 5.18
447541 AK000288 Hs.18800 hypothetical protein FLJ20281 5.18
434029 AA621763 Hs.170434 Homo sapiens cDNA FLJ14242 fis, clone OV 5.16
459294 AW977286 Hs.169531 RBP1-like protein 5.16
429441 AJ224172 Hs.204096 lipophilin B (uteroglobin family member) 5.16
424692 AA429834 Hs.151791 KIAA0092 gene product 5.15
427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 5.15
419872 AI422951 Hs.146162 ESTs 5.15
429422 AK001494 Hs.202596 Homo sapiens cDNA FLJ10632 fis, clone NT 5.14
448902 Z45998 Hs.22543 Homo sapiens mRNA; cDNA DKFZp761i1912 (f 5.14
459055 N23235 Hs.30567 ESTs, Weakly similar to B34087 hypotheti 5.14
431318 AA502700 Hs.293147 ESTs, Moderately similar to A46010 X-lin 5.14
452953 AI932884 Hs.271741 ESTs, Weakly similar to A46010 X-linked 5.13
428372 AK000684 Hs.183887 hypothetical protein FLJ22104 5.12
434401 AI864131 Hs.71119 Putative prostate cancer tumor suppresso 5.12
416434 AW163045 Hs.79334 nuclear factor, interleukin 3 regulated 5.11
410268 AA316181 Hs.61635 six transmembrane epithelial antigen of 5.10
417517 AF001176 Hs.82238 POP4 (processing of precursor, S. cerev 5.10
453616 NM_003462 Hs.33846 dynein, axonemal, light intermediate pol 5.10
427958 AA418000 Hs.98280 potassium intermediate/small conductance 5.09
407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep 5.08
425154 NM_001851 Hs.154850 collagen, type IX, alpha 1 5.08


412863 


AA121673 


Hs.59757 


zinc finger protein 281 


5.06 


420807 


AA280627 


Hs£7846 


ESTs 


5.06 


430568 


AA769221 


HSJ270847 


delta-tubufln 


5.06 
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H40104 
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418576 


AW968159 


HS.2B9104 


Alu-blndtng protein with zinc finger dom 


C AC 

O.UD 


D 


A-IOQOO 
4lO0ZtJ 


\MC7O0 

T IO/2o 


Ue TC90G 


guanyiaie cyclase 1 , soiuuie, aipna 0 






414271 


AK000275 


11. TCQ74 

HS.75871 


protein kinase C binding protein 1 


c /w- 
O.W 




432729 


AKQQ0292 




kiinnfhndivit nrvttnrn CI IOAO0C 

nypotneucai protein rLJ2U2o5 






433433 


Al 692623 


Hs.121513 


Homo sapiens clone £3-1 placenta expres 


O.U4 




439662 


H97552 




fcoiS 




11) 


439743 


AL389956 


Hs283858 


Homo sapiens mRNA fuii length insert cdn 


O.U4 




417511 


AL0491/D 


Li* 00000 


chord in-like 


C AO 
O.U& 




437814 


AI088192 


M . 4 nr At A 

Hs.1 35474 


rpT. *«t»,:i^-»— rM^vn lm imam atd r\ 

cois, weawy similar to dua9_human a j r-u 






426342 AF093419 Hs.169378 multiple PDZ domain protein 5.02
429782 NM_005754 Hs.220689 Ras-GTPase-activating protein SH3-domain 5.02
429975 AI167145 Hs.165538 ESTs 5.02
436209 AW850417 Hs.254020 ESTs, Moderately similar to unnamed prot 5.02
438571 AW020775 Hs.56022 ESTs 5.02
450223 AM18204 Hs.241493 natural killer-tumor recognition sequenc 5.02
408267 AW380525 Hs.267705 tubulin-specific chaperone e 5.01
417730 Z44761 gb:HSC28F061 normalized infant brain cDN 5.00
425465 L18964 Hs.1904 protein kinase C, iota 5.00
430599 NM_004855 Hs.247118 phosphatidylinositol glycan, class B 5.00
450961 AW978813 Hs£50867 metallothionein 1E (functional) 5.00
451386 AB029006 Hs.26334 spastic paraplegia 4 (autosomal dominant 5.00
420380 AA640891 Hs.1024d6 ESTs 4.99
424947 R77952 Hs.239625 ESTs, Weakly similar to alternatively sp 4.99
442653 BE269247 Hs.170226 gb:601185486F1 NIH_MGC_8 Homo sapiens cD 4.98
457211 AW972565 Hs.32399 ESTs, Weakly similar to S51797 vasodilat 4.97
425851 NM_001490 Hs.159642 glucosaminyl (N-acetyl) transferase 1, c 4.97
446279 AA490770 Hs.182382 ESTs 4.96
433377 AI752713 Hs.43845 ESTs 4.96
450218 R02018 Hs.168640 ankylosis, progressive (mouse) homolog 4.96
412715 NM_000947 Hs.74519 primase, polypeptide 2A (58kD) 4.94
448164 R61680 Hs.26904 ESTs, Moderately similar to Z195_HUMAN Z 4.94
420121 AW968271 Hs.191534 ESTs, Weakly similar to ALU1_HUMAN ALU S 4.94
421689 N87820 Hs.106826 KIAA1696 protein 4.93
445808 AV655234 Hs.298083 ESTs, Moderately similar to PC4259 fern 4.92
416533 BE244053 Hs.79362 retinoblastoma-like 2 (p130) 4.92
418049 AA211467 Hs.190488 Homo sapiens, Similar to nuclear localiz 4.92
436039 AW023323 Hs.121070 ESTs 4.92
432653 N62096 Hs.293185 ESTs, Weakly similar to JC7328 amino aci 4.91
420324 AF163474 Hs.96744 prostate androgen-regulated transcript 1 4.91
403047 4.91
436899 AA764852 Hs.291567 ESTs 4.90


431117 


AF003522 


HS250500 


delta (Drosopnila)-Oke 1 


A AA 




427617 


D42063 


Hs.179825 


RAN binding protein 2 


A CO 




428804 


AKG00713 


Hs.193736 


hypothetical protein HJ20706 


4.00 




433050 


AI093930 


Hs.163440 


Homo sapiens cDNA: FLI21000 fis, done c 


4.0O 


jU 


418575 


AA225313 


Hs.222886 


ESTs, Weakly similar to TRHY_HUMAN TRICH 


A OC 

4.00 


432615 


AA557191 


HS55028 


ESTs, Weakly similar to 154374 gene Nrz 


4.00 




412652 


AI801777 


Hs.6774 


coTS 


4.00 




432473 


AI2Q2703 


Hs.152414 


ESTs 


4.00 




449071 


NMj005872 


Hs.22960 


breast carcinoma amplified sequence 2 


AAA 
4.00 




450654 


AJ245587 


Hs.25275 


KruppeMype zinc finger protein 


4.00 


418866 


T65754 


Hs.100489 


goyci icu/.s i otratagene lung (9o72iu) 11 


4XJ0 




407596 


R86913 




gbyo30f05.r1 Soares fetal liver spleen 


4.04 




456516 


BE172704 


Hs.222746 


isi a A-4C4A mm!aI« 

KiAAiolO protein 


A OA 
4.04 










ESTs 


4.84 


448730 AB032983 Hs.21894 KIAA1157 protein 4.84
458339 AW976853 Hs.172843 ESTs 4.83
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second typ 4.82
420159 AI572490 Hs.99785 Homo sapiens cDNA: FLJ21245 fis, clone C 4.82
424103 NM_001918 Hs.139410 dihydrolipoamide branched chain transacy 4.82
449535 W15267 Hs.23672 low density lipoprotein receptor-related 4.82
422048 NM_012445 Hs.288126 spondin 2, extracellular matrix protein 4.82
416737 AF154335 Hs.79691 LIM domain protein 4.82
419972 AL041465 Hs.294038 golgin-67 4.81
420235 AA256756 Hs.31178 ESTs 4.81
423412 AF109300 Hs.147924 prostate cancer associated protein 5 4.80
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AAo 1 120/ 
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AIOO1C0C 

Alo21o2b 


HS. 191 602 


eSTS 


/ an 


A04QOQ 

421o2o 


AWoolboo 


HS2B9109 


histone deacetylase 3 


4,/y 


424602 


AK002055 


HS.3G1129 


hypothetical protein rUl 1 193 


A TO 


AOQOEA 

42ooo4 


AA420000 


HS.lb0o4l 


rCT e kinAAmiaUt c>m!tart/\ Al 1 H 141 IMAM A 

to 1 S, MOaGiHISIy SSTttiar 10 ALUl_nUMAN A 


A 7fl 


452335 


AW1oo944 


HS.61272 


fcoTS 


A 7C 
4.f0 


A 1 fYTCC 

41U765 


A! 694372 


m _ coa on 
HS.66180 


nucleosome assembly protein Hike 2 




421040 


AA715026 


Hs.135280 


ESTs 


4.70 


421518 


Al 056392 


HS.2Q8B19 


ESTs 


4./0 


452560 


BE077084 




ESTs 


4.76 


409752 


AW963990 




gD:cST376u63 MAuc resequences, magh Homo 


A 

4./0 


439703 


AF086538 


HS.1 96245 


ESTs 


4.7o 


418636 AI655499 Hs.161712 ESTs 4.74
450642 R39773 Hs.7130 copine IV 4.74
419879 Z17805 Hs.93564 Homer, neuronal immediate early gene, 2 4.74
411440 AW749402 gb:QV4-BT0383-281299-061-c06 BT0383 Homo 4.74
450649 NM_001429 Hs.297722 E1A binding protein p300 4.74
408738 NM_014785 Hs.47313 KIAA0258 gene product 4.73
435020 AW505076 Hs.301855 DiGeorge syndrome critical region gene 8 4.72
411624 BE145964 KIAA0594 protein 4.72
439360 AA448488 Hs.55346 ribosomal protein L44 4.72
440491 R35252 Hs.24944 ESTs, Weakly similar to 2109260A B cell 4.72
442611 BE077155 Hs.177537 hypothetical protein DKFZp761B1514 4.72
443555 N71710 Hs.21398 ESTs, Moderately similar to A Chain A, H 4.72
453800 BE300741 Hs.288416 hypothetical protein FLJ13340 4.72
457528 AW973791 Hs.292784 ESTs 4.72
416795 AI497778 Hs.168053 HBVpX associated protein-8 4.71
407302 R74206 Hs.268755 ESTs, Weakly similar to I78885 serine/th 4.71
404721 4.70
426261 AW242243 Hs.168670 peroxisomal farnesylated protein 4.70
431924 AK000850 Hs.272203 Homo sapiens cDNA FLJ20843 fis, clone AD 4.70
435256 AF193766 Hs.13872 cytokine-like protein C17 4.70
438295 AI394151 Hs.37932 ESTs 4.70
442655 AW027457 Hs.30323 ESTs, Weakly similar to B34087 hypotheti 4.70
415788 AW628686 Hs.78851 KIAA0217 protein 4.69
442760 BE075297 Hs.10067 ESTs, Weakly similar to A43932 mucin 2 p 4.69
432432 AA541323 Hs.115831 ESTs 4.68
454398 AA463437 Hs.11556 Homo sapiens cDNA FLJ12566 fis, clone NT 4.68
452741 BE392914 Hs.30503 Homo sapiens cDNA FLJ11344 fis, clone PL 4.67
424853 BE549737 Hs.132967 Human EST clone 122887 mariner transposa 4.67
419706 C04649 Hs.77899 tropomyosin 1 (alpha) 4.66
412088 AI689496 Hs.108932 ESTs 4.65
416276 U41060 Hs.79136 LIV-1 protein, estrogen regulated 4.64
429281 AA830856 Hs.29808 Homo sapiens cDNA: FLJ21122 fis, clone C 4.64
448207 AI475490 Hs.170577 ESTs 4.64
408374 AW025430 Hs.155591 forkhead box F1 4.64
447162 BE328091 Hs.157396 ESTs, Weakly similar to A46010 X-linked 4.64
451900 AB023199 Hs.27207 KIAA0982 protein 4.63
421437 AW821252 Hs.104336 hypothetical protein 4.63
418624 AI734080 Hs.104211 ESTs 4.63
426172 AA371307 Hs.125056 ESTs 4.62
439831 AW136488 Hs.25545 ESTs 4.61
452994 AW962597 Hs.31305 KIAA1547 protein 4.61
457726 AI217477 Hs.194591 ESTs 4.60


434629 


AA789081 


Hs.4029 


glioma-ampfified sequence-41 


a an 
4.00 


403764 
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ESTs 
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458 


451246 AW189232 Hs.39140 cutaneous T-cell lymphoma tumor antigen 4.58
433234 AB040928 Hs.65366 KIAA1495 protein 4.57
424983 AI742434 Hs.169911 ESTs 4.56
437812 AI582291 Hs.16846 ESTs, Weakly similar to 04HUD1 debrisoqu 4.56
438447 AI082883 Hs.167593 hypothetical protein FLJ13409; KIAA1711 4.55
434715 BE005346 Hs.116410 ESTs 4.55
447673 AI823987 Hs.182285 ESTs 4.54
408897 N50204 Hs.283709 lipopolysaccharide specific response-7 p 4.54
436645 AW023424 Hs.156520 ESTs 4.54
421247 BE391727 Hs.102910 general transcription factor IIH, polype 4.53
450377 AB033091 Hs.24936 KIAA1265 protein
433644 AW342028 Hs.256112 gb:hb75d03j(1 NCIjCGAPJJE Homo sapiens 4.53
408321 AW405882 Hs.44205 cortistatin 4.53
439225 AA192669 Hs.45032 ESTs 4.52
440348 AW015802 Hs.47023 ESTs 4.52
446351 AW444551 Hs.258532 x001 protein 4.52
451212 AW902672 Hs.287334 ESTs 4.52
430294 AI538226 Hs.135184 guanine nucleotide binding protein 4 4.52
435005 U80743 Hs.4316 trinucleotide repeat containing 12 4.52
448072 AI459306 Hs.24908 ESTs 4.50
403721 4.50
451018 AW965599 Hs.247324 mitochondrial ribosomal protein S14 4.50
453070 AK001465 Hs.31575 SEC63, endoplasmic reticulum translocon 4.49
417412 X16898 Hs.52112 interleukin 1 receptor, type I 4.48
439735 AI635386 Hs.142846 hypothetical protein 4.48
435663 AI023707 Hs.134273 ESTs 4.48
424036 AA770688 Hs.31946 H2A histone family, member L 4.48
426386 AA748850 Hs.174877 bladder cancer overexpressed protein 4.48
408622 AA056060 Hs.202577 Homo sapiens cDNA FLJ12166 fis, clone MA 4.47
444269 AI590346 Hs.146220 ESTs 4.47
430187 AI799909 Hs.158989 ESTs 4.46
427761 AA412205 Hs.140996 ESTs 4.46
430261 AA305127 Hs.237225 hypothetical protein HT023 4.46
444169 AV648170 Hs.58756 ESTs 4.44
430598 AK001764 Hs.247112 hypothetical protein FLJ10902 4.44
412903 BE007967 Hs.155795 ESTs 4.44
417048 AI088775 Hs.55498 geranylgeranyl diphosphate synthase 1 4.44
442710 AI015631 Hs.23210 ESTs 4.44
457413 AA743462 Hs.165337 ESTs 4.44
400303 AA242758 Hs.79136 LIV-1 protein, estrogen regulated 4.42
443268 AI800271 Hs.129445 hypothetical protein FLJ12496 4.42
438209 AL120659 Hs.6111 aryl-hydrocarbon receptor nuclear transl 4.42
431724 AA514535 Hs.283704 ESTs 4.41
412280 AW205116 Hs.272814 hypothetical protein DKFZp434E1723 4.40
440801 AA906366 Hs.190535 ESTs 4.40
452959 AI933416 Hs.189674 ESTs 4.40
453861 AI026838 Hs.30120 ESTs, Weakly similar to NUCL_HUMAN NUCLE 4.40
417421 AL138201 Hs.82120 nuclear receptor subfamily 4, group A, m 4.40
447270 AC002551 Hs.331 general transcription factor IIIC, polyp 4.38
433641 AF080229 gb:Human endogenous retrovirus K clone 1 4.38
447078 AW885727 Hs.301570 ESTs 4.38
424242 AA337476 hypothetical protein MGC13102 4.37
408170 AW204516 Hs.31835 ESTs 4.36
448757 AI366784 Hs.48820 TATA box binding protein (TBP)-associate 4.36
420021 AA252848 Hs.293557 ESTs 4.36
449694 AI659790 Hs.253302 ESTs 4.36
453867 AI929383 Hs.108196 hypothetical protein DKFZp434N185 4.36
458712 AI347502 Hs.173066 hypothetical protein FLJ20761 4.36
417251 AW015242 Hs.99488 ESTs, Weakly similar to YK54_YEAST HYPOT 4.35
434423 NM_006769 Hs.3844 LIM domain only 4 4.35
423427 AL137612 Hs.285848 KIAA1454 protein 4.34
415715 F30364 ESTs 4.33
404561 4.32
422969 AA782536 Hs.122647 N-myristoyltransferase 2 4.32
423685 BE350494 Hs.49753 uveal autoantigen with coiled coil domai 4.32
443977 AL120986 Hs.150627 ESTs, Weakly similar to I38022 hypotheti 4.32
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II 4.32
431583 AL042613 Hs.262476 S-adenosylmethionine decarboxylase 1 4.31
411379 AI816344 Hs.12554 ESTs, Weakly similar to NPL4_HUMAN NUCLE 4.30
421476 AW953805 Hs.21887 ESTs 4.30
425178 H16097 Hs.161027 ESTs 4.30
439262 AA832333 Hs.124399 ESTs 4.30
442818 AK001741 Hs.8739 hypothetical protein FLJ10879 4.30
421977 W94197 Hs.110165 ribosomal protein L26 homolog 4.29
Hs.16063 hypothetical protein FLJ21877 3.71
449897 AW819642 Hs.24135 transmembrane protein vezatin; hypotheti 3.71
420297 AI628272 Hs.88323 ESTs, Weakly similar to ALU1_HUMAN ALU S 3.70
423065 R96158 Hs.194606 Homo sapiens, clone MGC5406, mRNA, comp 3.70
429340 N35938 Hs.199429 Homo sapiens mRNA; cDNA DKFZp434M2216 (f 3.70
437777 AA768098 Hs.189079 ESTs 3.70
440351 AF030933 Hs.7179 RAD1 (S. pombe) homolog 3.70
443603 BE502601 Hs.134289 ESTs, Weakly similar to KIAA1063 protein 3.70
446965 BE242873 Hs.16677 WD repeat domain 15 3.70
412350 AI659306 Hs.73826 protein tyrosine phosphatase, non-recept 3.70
433852 AI378329 Hs.126629 ESTs 3.70
433142 AL120697 Hs.110640 ESTs 3.69
419994 AA282881 Hs.190057 ESTs 3.69
412628 AI972402 Hs.173902 hypothetical protein MGC2648 3.69
431416 AA532718 Hs.178604 ESTs 3.69
439444 AI277652 Hs.54578 ESTs, Weakly similar to I38022 hypotheti 3.68
414709 M704703 Hs.77031 Sp2 transcription factor 3.68
447397 BE247676 Hs.18442 E-1 enzyme 3.68
405718 3.68
425217 AU076696 Hs.155174 CDC5 (cell division cycle 5, S. pombe, h 3.68
442242 AV647908 Hs.90424 Homo sapiens cDNA: FLJ23285 fis, clone H 3.68
424690 BE538356 Hs.151777 eukaryotic translation initiation factor 3.68
421734 AI318624 Hs.107444 Homo sapiens cDNA FLJ20562 fis, clone KA 3.67
427221 L15409 Hs.174007 von Hippel-Lindau syndrome 3.67
439864 AI720078 Hs.291997 ESTs, Weakly similar to A47582 B-cell gr 3.66
402408 3.66
426327 W03242 Hs.44898 Homo sapiens clone TCCCTA00151 mRNA sequ 3.66
427119 AW880562 Hs.114574 ESTs 3.66
427356 AW023482 Hs.97849 ESTs 3.66
452946 X95425 Hs.31092 EphA5 3.66




419078 


M93119 


H&.89584 


insufinoma-assodated 1 


3.00 




416295 AI064824 Hs.193385 ESTs 3.65
427144 X95097 Hs.2126 vasoactive intestinal peptide receptor 2 3.65
447500 AI381900 Hs.159212 ESTs 3.65


453127 


AI696671 


n A nAi 4 4 n 

Hs.294110 


ESTs 


o.oo 




423396 


AJ382555 


Hs.127950 


bromodomaln-contaming 1 


o CC 
O.OO 




419346 


AI830417 




polybromo 1 


O CA 

3.04 




441540 


CO loo7 


nS.12712o 


tSTS 


O.04 


446501 AI302616 Hs.150819 ESTs 3.64
459527 AW977556 Hs^91735 ESTs, Weakly similar to I78885 serine/th 3.63
446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, me 3.63
435706 W31254 Hs.7045 GL004 protein 3.63
400110 3.62
410313 R10305 Hs.185683 ESTs 3.62
414713 BE465243 Hs.12664 ESTs 3.62
436279 AW900372 Hs.180793 ESTs, Weakly similar to S65657 alpha-1C- 3.62
439818 AL360137 Hs.19934 Homo sapiens mRNA full length insert cDN 3.62
451797 AW663858 Hs.56120 small inducible cytokine subfamily E, me 3.62
451294 AM57338 Hs.29894 ESTs 3.62
434194 AF119847 Hs.283940 Homo sapiens PRO1550 mRNA, partial cds 3.62
404939 3.62
408101 AW968504 Hs.123073 CDC2-related protein kinase 7 3.62
435846 AA700870 Hs.14304 ESTs 3.61
432833 N51075 Hs.47191 ESTs 3.61
427276 AA400269 Hs.49598 ESTs 3.61
433495 AW373784 Hs.71 alpha-2-glycoprotein 1, zinc 3.60
403137 3.60
404165 3.60
409571 AA504249 Hs.187585 ESTs 3.60
410561 BE540255 Hs.6994 Homo sapiens cDNA: FLJ22044 fis, clone H 3.60
412924 BE018422 Hs.75258 H2A histone family, member Y 3.60
434228 Z42047 Hs.283978 Homo sapiens PRO2751 mRNA, complete cds 3.60
436797 AA731491 Hs.178518 hypothetical protein MGC14879 3.60
437162 AW005505 Hs.3464 thyroid hormone receptor coactivating pr 3.60
437444 H46008 Hs.31518 ESTs 3.60
404210 3.59
446157 BE270828 Hs.131740 Homo sapiens cDNA: FLJ22562 fis, clone H 3.59
437587 AI591222 Hs.122421 Human DNA sequence from clone RP1-187J11 3.58
423147 AA987927 Hs.131740 Homo sapiens cDNA: FLJ22562 fis, clone H 3.57
452226 AA024898 Hs.296002 ESTs 3.56
443775 AF291664 Hs£04732 matrix metalloproteinase 26 3.56
452501 AB037791 Hs.29716 hypothetical protein FLJ10980 3.56
428647 AA830050 Hs.124344 ESTs 3.56
422443 NM_014707 Hs.116753 histone deacetylase 7B 3.55
447966 AA340605 Hs.105887 ESTs, Weakly similar to Homolog of rat Z 3.55
420892 AW975076 Hs.172589 nuclear phosphoprotein similar to S. cer 3.55
420230 AL034344 Hs.298020 forkhead box C1 3.55
418428 Y12490 Hs.85092 thyroid hormone receptor interactor 11 3.54
428949 AA442153 Hs.104744 hypothetical protein DKFZp434J0617 3.54
444929 AI685841 Hs.161354 ESTs 3.54
433339 AF019226 Hs.8036 glioblastoma overexpressed 3.54
424369 R87622 Hs£6714 KIAA1831 protein 3.54
433002 AF048730 Hs.279906 cyclin T1 3.53
435425 H16263 Hs.31416 ESTs 3.53
415621 AI648602 Hs.131189 ESTs 3.53
416974 AF010233 Hs.80667 RALBP1 associated Eps domain containing 3.53
405793 3.52
409770 AW499536 gb:UWF-BR0p-ap-c-12^HJLr1 NIH_MGC_5 3.52
425305 AA363025 Hs.155572 Human clone 23801 mRNA sequence 3.52
428939 AW236550 Hs.131914 ESTs 3.52
438388 AA806349 Hs.44698 ESTs 3.52
443703 AV646177 Hs.213021 ESTs 3.52
457940 AL360159 Hs.30445 Homo sapiens TRipartite motif protein ps 3.52
402444 3.52
409643 AW450866 Hs.257359 ESTs 3.51
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 3.51
432745 AI821926 Hs.269507 gbJif7Bf05j6 NCI_CGAP_Pr3 Homo sapiens 3.51
414222 AL135173 Hs.878 sorbitol dehydrogenase 3.51
430061 AB037817 Hs.230188 KIAA1396 protein 3.51
421491 H99999 Hs.42736 ESTs 3.50
422384 AA224077 Hs.42438 Sm protein F 3.50
434565 T52172 ESTs 3.50
438379 N23018 Hs.171391 C-terminal binding protein 2 3.50
439741 BE379646 Hs.6904 Homo sapiens mRNA full length insert cDN 3.50
447311 R37010 Hs.33417 Homo sapiens cDNA: FLJ22806 fis, clone K 3.50
447805 AW627932 Hs.19614 gemin4 3.50
454265 H03556 Hs.300949 ESTs, Weakly similar to thyroid hormone 3.50
418838 AW385224 Hs.35198 ectonucleotide pyrophosphatase/phosphodi 3.50
448804 AW512213 Hs.42500 ADP-ribosylation factor-like 5 3.50
409617 BE003760 Hs.35209 Homo sapiens mRNA; cDNA DKFZp434K0514 (f 3.49
434075 AW003416 Hs.160604 ESTs 3.49
444190 AI878918 Hs.10526 cysteine and glycine-rich protein 2 3.49
435017 AA336522 Hs.12854 angiotensin II, type 1 receptor-associat 3.48
423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 3.48
420271 AI954365 Hs.42892 ESTs 3.48
443684 AI681307 Hs.166674 ESTs 3.48
444168 AW379879 gb:RC1-HT0256-081199-011-f01 HT0256 Homo 3.48
446074 AA079799 Hs.29263 hypothetical protein FLJ11896 3.48



158 



WO 02/30268 



PCT/US01/32045 



452582 AL137407 Hs.29911 Homo sapiens mRNA; cDNA DKFZp434M232 (fr 3.48
431542 H63010 Hs.5740 ESTs 3.48
432697 AW975050 Hs.293892 ESTs, Weakly similar to ALU4_HUMAN ALU S 3.48
435572 AW975339 Hs.239828 ESTs, Weakly similar to GAG2_HUMAN RETRO 3.47
407192 AA609200 gb:af12e02.s1 Soares_testis_NHT Homo sap 3.47
413435 X51405 Hs.75360 carboxypeptidase E 3.46
447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas 3.46
447958 AW796524 Hs.68644 Homo sapiens microsomal signal peptidase 3.46
425312 AA354940 Hs.145958 ESTs 3.46
442007 AA301116 Hs.142838 nucleolar phosphoprotein Nopp34 3.46
417455 AW007066 Hs.18949 ESTs, Weakly similar to CA2B_HUMAN COLLA 3.45
426931 NM_003416 Hs.2076 zinc finger protein 7 (KOX 4, clone HF.1 3.45
408739 W01556 Hs.238797 ESTs, Moderately similar to I38022 hypot 3.45
436024 AI800041 Hs.190555 ESTs 3.45
408418 AW963897 Hs.44743 KIAA1435 protein 3.45
409151 AA306105 Hs.50785 SEC22, vesicle trafficking protein (S. c 3.44
418626 AW299508 Hs.135230 ESTs 3.44
420560 AW207748 Hs.59115 ESTs 3.44
420686 AI950339 Hs.40782 ESTs 3.44
428870 AA436831 Hs.36049 ESTs 3.44
436754 AI061288 Hs.133437 ESTs 3.44
437960 AI669586 Hs.222194 ESTs 3.44
452300 AW628045 Hs.28896 Homo sapiens mRNA full length insert cDN 3.44
421887 AW161450 Hs.109201 CGWJ6 protein 3.44
TABLE 5A shows the accession numbers for those primekeys lacking a UnigeneID in Tables
5, 6, and 7. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accession
407596 1003489_1 R86913 R86901 H25352 R01370 H43764 AW044451 W21298
408432 1058667_1 AW195262 R27868 AW811262
409752 115301_1 AW963990 AA078196 AW749482 AA077468 BE151571 AA376917
409770 1154048_1 AW499536 AW499553 AW502138 AW499537 AW502136 AW501743
411440 124577_1 AW749402 AW749403 Z45743 R80376 AAD93358
411479 1247077_1 AW848047 AW848202 AW848631 AW848142 AW848702 AW848121 AW848632 AW848140 AW848571 AW848009 AW848067 AW848069 AW848905 AW848214
411624 1252166_1 BE145964 BE146286 AW854564
412991 134248_1 AW949013 AA126111
414269 143133_1 AA298489 AA137165
415123 1523390_1 D60925 D60828 D80787
415715 1548818_1 F30364 F36559 T15435
416288 1585983_1 H51299 H44619 H46391 R86024 H51892 T72744
416289 1586037_1 W26333 R05358 H44682
417730 1695795_1 Z44761 R25801 R11926 R35604
418636 177402_1 AW749855 AA225995 AW750208 AW750206
419346 184129_1 AI830417 AA236612
419536 185688_1 AA603305 AA244095 AA244183
420111 190755_1 AA255652 AA280911 AW967920 AA262684
422219 213547_1 AW978073 AW978072 AA807550 AA306567
424179 236389_1 F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502
424242 237181_1 AA337476 AW966227 AA450376 AW960222 AA381051
428002 285602_1 AM18703 AA418711 BE071915 BE071920 BE071912
429163 300543_1 AA884766 AW974271 AA592975 AA447312
432189 342819_1 AA527941 AI810608 AI620190 AA635266
432340 345248_1 AA534222 AA632632 T81234
432363 345469_1 AA534489 AW970240 AW970323
[illegible] 356839_1 AA650114 AW974148 AA572946
433586 370470_1 T85301 AW517087 AA601054 BE073959
433641 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 N25695 AW665466 AI818326 AA126128 AM80345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 AA348354 AI493192
433687 373061_1 AA743991 AA604852 AW272737
433891 376239_1 AA613792 AW182329 T05304 AW858385
434415 385931_1 BE177494 AW276909 AA632849
434565 38898_1 T52172 AF147324 T52248
434804 393481_1 AA649530 AA659316 H64973
437113 433234_1 AA744693 AW750059
444168 593829_1 AW379879 AI126285 H12014
448212 755099_1 AI475858 AW969013
448310 757918_1 AI480316 AW847535
451746 883303_1 M86178 AI813822 D56993
452560 922216_1 BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212 AW806207 AW806208 AW806210 AI907497
452712 928309_1 AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784
453773 980699_1 AL133761 AL133767
455276 1272541_1 BE176479 BE176678 BE176357 BE176550 AW886079 BE176676 BE176615 BE176555 BE176489 BE176610 BE176362
455309 1278153_1 AW894017 AW893956 AW894032
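
As an illustration of how a Table 5A row might be read programmatically, the following minimal sketch splits one row into its Pkey, CAT number, and accession list. It assumes a whitespace-delimited row with the CAT number written as digits followed by `_1`; the class and function names are hypothetical and are not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterRow:
    """One Table 5A row (hypothetical container for illustration)."""
    pkey: str              # Eos probeset identifier
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accessions in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number, accessions."""
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(pkey=tokens[0], cat_number=tokens[1], accessions=tokens[2:])


# Example row taken from Table 5A (CAT suffix assumed to be "_1").
row = parse_cluster_row("453773 980699_1 AL133761 AL133767")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the accessions as a list makes it simple to look up which cluster a given Genbank entry belongs to.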
TABLE 5B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the
genomic sequence source used for prediction. Nucleotide locations of each predicted exon
are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
401045 8117619 Plus 90044-90184,91111-91345
401424 8176894 Plus 24223-24428
401451 6634068 Minus 119926-121272
401714 6715702 Plus 96484-96681
401747 9789672 Minus 118596-118816,119119-119244,119609-1
131258,131866-131932,132451-132575,133580-134011
401785 7249190 Minus 165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942
401819 7467933 Minus 28217-28486
402408 9796239 Minus 110326-110491
402444 9796614 Plus 28391-28517
402791 6137008 Minus 51036-51207
403047 3540153 Minus 59793-59968
403137 9211494 Minus 92349-92572,92958-93084,93579-93712,93949-94072^91-94748,95214-95337
403721 7528046 Minus 156647-157366
403764 7717105 Minus 118692-118853
403797 8099896 Minus 123065-125008
404165 9926489 Minus 69025-69128
404210 5006246 Plus 169926-170121
404253 9367202 Minus 55675-56055
404561 9795980 Minus 69039-70100
404571 7249169 Minus 112450-112648
404721 9856648 Minus 173763-174294
404915 7341766 Minus 100915-101087
404939 6862697 Plus 175318-175476
405403 6850244 Minus 37491-37670,40951-41031
405685 4508129 Minus 37956-38097
405718 9795467 Plus 113080-113266
405793 1405887 Minus 89197-89453
405876 6758747 Plus 39694-40031
405917 7712162 Minus 106829-107213
406414 9256407 Plus 4959349850
406554 7711566 Plus 106956-107121
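
The Nt_position column lists each predicted exon as a start-end pair, with multiple exons separated by commas. The sketch below shows one way such a field could be turned into numeric intervals and a total exon length. It assumes the coordinates are inclusive at both ends, which the table itself does not state; the function name is hypothetical.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Parse 'start-end,start-end,...' into a list of (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start exceeds end in segment {part!r}")
        exons.append((start, end))
    return exons


# Nt_position value for Pkey 401045 in Table 5B.
exons = parse_nt_position("90044-90184,91111-91345")

# Assumption: both coordinates are inclusive, so each exon spans end - start + 1 bases.
total_length = sum(end - start + 1 for start, end in exons)
print(exons, total_length)
```

A field damaged in extraction (for example one with a missing hyphen) raises an error here rather than being silently misread.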
TABLE 6: 286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues
that are likely to be extracellular or cell-surface proteins. These were selected as for Table 5
and the predicted protein contained a structural domain that is indicative of extracellular
localization (e.g. egf, 7tm domains).






Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal tissue

Pkey  ExAccn  UnigeneID  Unigene Title  R1


«*uyooi  He ZAAiR  sine oculis homeobox (Drosophila) homolo  48.28
4U9/OI  A A10CQOC  He AZ  thymosin, beta, identified in neuroblast  45.24
A Aft0007Q  Hs.61635  six transmembrane epithelial antigen of  AO AD
Uc GRAOft  41.12
426747  Hs.171995  kallikrein 3, (prostate specific antigen  31.30
X07730  Hs.171995  kallikrein 3, (prostate specific antigen
MMOUOOt**  U e 105')  acid phosphatase, prostate
AOARAR  Uc 1A30  neuropeptide Y  2357
/4nceoc  2050
420757  Hs.99915  androgen receptor (dihydrotestosterone r  19.72
A IRQ Oil  Hs 8954A  19.56
452792  AB037765  KIAA1344 protein  17^9
445472  AB006631  Hs.12784  Homo sapiens mRNA for KIAA0293 gene, par  VJQQ
414565  AA502972  Uc 1R339fl  hypothetical protein PI 11 1**QH  16.82


431716  Hs.268012  fatty-acid-Coenzyme A ligase, long-chain  16.60
408430  S79876  Hs.44926  dipeptidylpeptidase IV (CD26, adenosine  16.28
408000  L11690  Hs520  bullous pemphigoid antigen 1 (230/240kD)  1554
430226  BE245562  Hs.2551  adrenergic, beta-2-, receptor, surface  15.40
444484  AK002126  Hs.11260  hypothetical protein FLJ11264  14.76
418601  AA279490  Hs56368  calmegin  1456
448999  AF179274  Hs.22791  transmembrane protein with EGF-like and  1455
416182  NM_004354  Hs.79069  cyclin G2  12.94
420544  AA677577  Hs.98732  Homo sapiens Chromosome 16 BAC clone CIT  12.79
445413  AA151342  Hs.12677  CGI-147 protein  12.64
453930  AA419466  Hs56727  hypothetical protein FLJ10903  12.22
440286  U29589  Hs.7138  cholinergic receptor, muscarinic 3  12.04
452784  BE463857  Hs.151258  hypothetical protein FLJ21062  1156
450203  AF097994  Hs.301528  L-kynurenine/alpha-aminoadipate aminotra  1158
448045  AJ297436  Hs.20166  prostate stem cell antigen  1151
449650  AF055575  Hs.23838  calcium channel, voltage-dependent, L ty  11.18
420381  D50640  Hs.337616  phosphodiesterase 3B, cGMP-inhibited  11.10
425665  AK001050  Hs.159066  hypothetical protein FLJ10188  11.08
425710  AF030880  Hs.159275  solute carrier family, member 4  11.08
428728  NM_016625  Hs.191381  hypothetical protein  11.04
407021  U52077  gb:Human mariner1 transposase gene, comp  11.02
410733  D84284  Hs.66052  CD38 antigen (p45)  11.02
452340  NM_002202  Hs.505  ISL1 transcription factor, LIM/homeodoma  10.85
428819  AL135623  Hs.193914  KIAA0575 gene product  10.48
421991  NM_014918  Hs.110488  KIAA0990 protein  10.04
431217  NM_013427  Hs.250830  Rho GTPase activating protein 6  9.75
421470  R27496  Hs.1378  annexin A3  9.64
409262  AK000631  Hs52256  hypothetical protein FLJ20624  9.45
435980  AF274571  Hs.129142  deoxyribonuclease II beta  9.24
421246  AW582962  Hs.102897  CGI-47 protein  9.20
410001  AB041036  Hs.57771  kallikrein 11  9.03
441791  AW372449  Hs.175982  hypothetical protein FLJ21159  9.02
404571  8.66


456497  AW967956  Hs.123648  ESTs, Weakly similar to AF108460 1 ubinu  856
419968  X04430  Hs.93913  interleukin 6 (interferon, beta 2)  8.36
433172  AB037841  Hs.102652  hypothetical protein ASH1  8.30
422631  BE218919  Hs.118793  hypothetical protein FLJ10688  857
427674  NM_003528  Hs5178  H2B histone family, member Q  850
404915  8.08
452259  AA317439  Hs58707  signal sequence receptor, gamma (translo  8.06
452891  N75582  Hs512875  ESTs, Weakly similar to DYH9_HUMAN CILIA  8.02
439731  AI953135  Hs.45140  hypothetical protein FLJ14084  7.98
419839  U24577  Hs.93304  phospholipase A2, group VII (platelet-ac  7.68
420120  AL049610  Hs.95243  transcription elongation factor A (SII)-  7.64
424099  AF071202  Hs.139336  ATP-binding cassette, sub-family C (CFTR  7.64
448706  AW291095  Hs51814  interleukin 20 receptor, alpha  7.52
410227  AB00S284  Hs.61152  exostoses (multiple)-like 2  7.49
425211  M18667  Hs.1867  progastricsin (pepsinogen C)  7.35
441736  AW292779  Hs.169799  ESTs  7.28
419991  AJ000098  Hs.94210  eyes absent (Drosophila) homolog 1  7.20
425018  BE245277  Hs.154196  E4F transcription factor 1  7.20
424560  AA158727  Hs.150555  protein predicted by clone 23733  7.18
409110  AA191493  Hs.48778  niban protein  7.10
421566  NM_000399  Hs.1395  early growth response 2 (Krox-20 (Drosop  7.04
431725  X65724  Hs.2839  Norrie disease (pseudoglioma)  6.98
425782  U66468  Hs.159525  cell growth regulatory with EF-hand doma  6.85
427408  AA583206  Hs5156  RAR-related orphan receptor A  6.79
435604  AA625279  Hs56892  uncharacterized bone marrow protein BM04  6.73
415874  AF091622  Hs.78893  KIAA0244 protein  6.54
401451  6.52
431778  AL080276  Hs568562  regulator of G-protein signalling 17  6.51
409089  NM_014781  Hs50421  KIAA0203 gene product  6.50
431992  NM_002742  Hs.2891  protein kinase C, mu  6.49
404253  6.42
421552  AF026692  Hs.105700  secreted frizzled-related protein 4  6.41
416806  NM_000288  Hs.79993  peroxisomal biogenesis factor 7  6.38
431958  X63629  Hs5877  cadherin 3, type 1, P-cadherin (placenta  6.30
439366  AF100143  Hs.6540  fibroblast growth factor 13  6.30
416836  D54745  Hs.80247  cholecystokinin  6.30
433383  AF034837  Hs.192731  double-stranded RNA specific adenosine d  6.29
450728  AW162923  Hs55363  presenilin 2 (Alzheimer disease 4)  6.25
413384  NM_000401  Hs.75334  exostoses (multiple) 2  6.22
423349  AF010258  Hs.127428  homeo box A9  6.20
424800  AL035588  Hs.153203  MyoD family inhibitor  6.18
425451  AF242769  Hs.157461  mesenchymal stem cell protein DSC54  6.14
447359  NM_012093  Hs.18268  adenylate kinase 5  6.00
410889  X91662  Hs.66744  twist (Drosophila) homolog (acrocephalos  5.97
408829  NM_006042  Hs.48384  heparan sulfate (glucosamine) 3-O-sulfot  5.94
453911  AW503857  Hs.4007  Sarcolemmal-associated protein  5.94
408875  NM_015434  Hs.48604  DKFZP434B168 protein  5.92
450480  X82125  Hs55040  zinc finger protein 239  5.90
451684  AF216751  Hs56813  C0A14  5.88
400301  X03635  Hs.1657  estrogen receptor 1  5.78
415077  L41607  Hs.934  glucosamine (N-acetyl) transferase 2, 1  5.74
418852  BE537037  Hs573294  hypothetical protein FLJ20069  5.72
446867  AB007891  Hs.16349  KIAA0431 protein  5.72
410232  AW372451  Hs.61184  CGI-79 protein  5.70
422762  AL031320  Hs.119976  Human DNA sequence from clone RP1-20N2 o  5.70
450616  AL133067  Hs.302689  hypothetical protein  5.70
408621  AI970672  Hs.46638  chromosome 11 open reading frame 8  5.65
439671  AW162840  Hs.6641  kinesin family member 5C  5.64
410196  AI936442  Hs59838  hypothetical protein FLJ10808  5.60
429170  NM_001394  Hs5359  dual specificity phosphatase 4  5.60
440738  AI004650  Hs525674  WD repeat domain 9  5.60
414342  AA742181  Hs.75912  KIAA0257 protein  5.59
422634  NM_016010  Hs.118821  CGI-62 protein  5.56
400268  5.55
439569  AW602166  Hs522399  CEGP1 protein  5.51
452823  AB012124  Hs.30696  transcription factor-like 5 (basic helix  5.48
431938  AA938471  Hs.54431  specific granule protein (28 kDa); cyste  5.44
427638  AA406411  Hs508341  ESTs, Weakly similar to KIAA0989 protein  5.42
421264  AL039123  Hs.103042  microtubule-associated protein 1B  5.38


421685  AF189723  Hs.106778  ATPase, Ca++ transporting, type 2C, memb  5.37
421987  AI133161  Hs.286131  CGI-101 protein  5.36
422806  BE314767  Hs.1581  glutathione S-transferase theta 2  5.34
432281  AK001239  Hs.274263  hypothetical protein FLJ10377  5.32
451982  F13036  Hs.27373  Homo sapiens mRNA; cDNA DKFZp56401763 (f  5.32
444042  NM_004915  Hs.10237  ATP-binding cassette, sub-family G (WHIT  5.31
447752  M73700  Hs.105938  lactotransferrin  5.29
451418  BE387790  Hs.26369  hypothetical protein FLJ20287  5.22
428593  AW207440  Hs.185973  degenerative spermatocyte (homolog Droso  5.21
447541  AK000288  Hs.18800  hypothetical protein FLJ20281  5.18
459294  AW977286  Hs.17428  RBP1-like protein  5.16
424692  AA429834  Hs.151791  KIAA0092 gene product  5.15
416434  AW163045  Hs.79334  nuclear factor, interleukin 3 regulated  5.11
410268  AA316181  Hs.61635  six transmembrane epithelial antigen of  5.10
417517  AF001176  Hs.82238  POP4 (processing of precursor , S. cerev  5.10
453616  NM_003462  Hs.33846  dynein, axonemal, light intermediate pol  5.10
427958  AA418000  Hs.98280  potassium intermediate/small conductance  5.09
407945  X69208  Hs.606  ATPase, Cu++ transporting, alpha polypep  5.08
418576  AW968159  Hs.289104  Alu-binding protein with zinc finger dom  5.05
413328  Y15723  Hs.75295  guanylate cyclase 1, soluble, alpha 3  5.04
432729  AK000292  Hs.278732  hypothetical protein FLJ20285  5.04
426342  AF093419  Hs.169378  multiple PDZ domain protein  5.02
429782  NM_005754  Hs.220689  Ras-GTPase-activating protein SH3-domain  5.02
436209  AW850417  Hs.254020  ESTs, Moderately similar to unnamed prot  5.02
430599  NM_004855  Hs.247118  phosphatidylinositol glycan, class B  5.00
451386  AB029006  Hs^6334  spastic paraplegia 4 (autosomal dominant  5.00
457211  AW972565  Hs.32399  ESTs, Weakly similar to S51797 vasodilat  4.97
425851  NM_001490  Hs.159642  glucosamine (N-acetyl) transferase 1, c  4.97
421689  N87820  Hs.106826  KIAA1696 protein  4.93
416533  BE244053  Hs.79362  retinoblastoma-like 2 (p130)  4.92
432653  N62096  Hs.293185  ESTs, Weakly similar to JC7328 amino aci  4.91
403047  4.91
431117  AF003522  Hs.250500  delta (Drosophila)-like 1  4.90
427617  D42063  Hs.199179  RAN binding protein 2  4.88
428804  AK000713  Hs.193736  hypothetical protein FLJ20706  4.88
449071  NM_005872  Hs.22960  breast carcinoma amplified sequence 2  4.86
407596  R86913  gb:7q30f05.r1 Soares fetal liver spleen  4.84
456516  BE172704  Hs.222746  KIAA1610 protein  4.84
458339  AW976853  Hs.172843  ESTs  4.83
422083  NM_001141  Hs.111256  arachidonate 15-lipoxygenase, second typ  4.82
449535  W15267  Hs.23672  low density lipoprotein receptor-related  4.82
422048  NM_012445  Hs.288126  spondin 2, extracellular matrix protein  4.82
424602  AK002055  Hs.151046  hypothetical protein FLJ11193  4.78
410765  AI694972  Hs.66180  nucleosome assembly protein 1-like 2  4.77
419879  Z17805  Hs.93564  Homer, neuronal immediate early gene, 2  4.74
450649  NM_001429  Hs^5272  E1A binding protein p300  4.74
411624  BE145964  Hs.103283  KIAA0594 protein  4.72
404721  4.70
426261  AW242243  Hs.168670  peroxisomal farnesylated protein  4.70
416276  U41060  Hs.79136  LIV-1 protein, estrogen regulated  4.64
408374  AW025430  Hs.155591  forkhead box F1  4.64
451900  AB023199  Hs.27207  KIAA0982 protein  4.63
421437  AW821252  Hs.104336  hypothetical protein  4.63
434629  AA789081  Hs.4029  glioma-amplified sequence-41  4.60
403764  4.58
421247  BE391727  Hs.102910  general transcription factor IIH, polype  4.53
403721  4.50
453070  AK001465  Hs.31575  SEC63, endoplasmic reticulum translocon  4.49
417412  X16896  Hs.82112  interleukin 1 receptor, type I  4.48
439735  AI635386  Hs.142846  hypothetical protein  4.48
430261  AA305127  Hs.237225  hypothetical protein HT023  4.46
430598  AK001764  Hs.247112  hypothetical protein FLJ10902  4.44
400303  AA242758  Hs.79136  LIV-1 protein, estrogen regulated  4.42
438209  AL120659  Hs.6111  aryl-hydrocarbon receptor nuclear transl  4.42
417421  AL138201  Hs.82120  nuclear receptor subfamily 4, group A, m  4.40
447270  AC002551  Hs.331  general transcription factor IIIC, polyp  4.38
434423  NM_006769  Hs.3844  LIM domain only 4  4.35
404561
422969  AA782536  Hs.122647  N-myristoyltransferase 2  4.32


423685  BE350494  Hs.49753  uveal autoantigen with coiled coil domai  4.32
425071  NM_013989  Hs.154424  deiodinase, iodothyronine, type II  4.32
431583  AL042613  Hs.262476  S-adenosylmethionine decarboxylase 1  4.31
442818  AK001741  Hs.8739  hypothetical protein FLJ10879  4.30
423740  Y07701  Hs.293007  aminopeptidase puromycin sensitive  4.24
424701  NM_005923  Hs.151988  mitogen-activated protein kinase kinase  4.21
424085  NM_002914  Hs.139226  replication factor C (activator 1) 2 (40  4.20
410294  AB014515  Hs.323712  KIAA0615 gene product  4.18
447124  AW976438  Hs.17428  RBP1-like protein  4.18
438018  AK001160  Hs5999  hypothetical protein FLJ10298  4.16
443857  AKJ89292  Hs.287621  hypothetical protein FLJ14069  4.15
446711  AF169692  Hs.12450  protocadherin 9  4.15
405403  4.14
448148  NM_016578  Hs.20509  HBV pX associated protein-8  4.13
417531  NM_003157  Hs.1087  serine/threonine kinase 2  4.12
433345  AI681545  Hs.152982  hypothetical protein FLJ13117  4.10
432712  AB016247  Hs.288031  sterol-C5-desaturase (fungal ERG3, delta  4.09
435114  AA775483  Hs.288936  mitochondrial ribosomal protein L9  4.08
445459  AI478629  Hs.158465  likely ortholog of mouse putative IKK re  4.08
402791  4.04
438660  U95740  Hs.6349  Homo sapiens, clone IMAGE 3010666, mRNA,  4.04
447568  AF155655  Hs.18885  CGI-116 protein  4.04
452211  AI985513  Hs.233420  ESTs  4.02
443292  AK000213  Hs.9196  hypothetical protein  4.01
420911  U77413  Hs.100293  O-linked N-acetylglucosamine (GlcNAc) tr  4.00
428738  NM_000380  Hs.192803  xeroderma pigmentosum, complementation g  3.95
430456  AA314998  Hs.241503  hypothetical protein  3.95
437531  AI400752  Hs.112259  T cell receptor gamma locus  3.93
428695  AI355647  Hs.189999  purinergic receptor (family A group 5)  3.91
410011  AB020641  Hs.57856  PFTAIRE protein kinase 1  3.91
446494  AA463276  Hs.288906  WW Domain-Containing Gene  3.91
409928  AL137163  Hs.57549  hypothetical protein dJ473B4  3.90
411598  BE336654  Hs.70937  H3 histone family, member A  3.90
425707  AF115402  Hs.11713  E74-like factor 5 (ets domain transcript  3.90
451806  NM_003729  Hs.27076  RNA 3'-terminal phosphate cyclase  3.89
401045  3.89
437372  AA323968  Hs.283631  hypothetical protein DKFZp547G183  3.89
417067  AJ001417  Hs.81086  solute carrier family 22 (extraneuronal  3.88
410467  AF102546  Hs.63931  dachshund (Drosophila) homolog  3.88
431930  AB035301  Hs.272211  cadherin 7, type 2  3.88
453047  AW023798  Hs.286025  ESTs  3.88
401785  3.88
458229  AI929602  Hs.177  phosphatidylinositol glycan, class H  3.86
406414  3.86
412494  AL133900  Hs.792  ADP-ribosylation factor domain protein 1  3.84
418329  AW247430  Hs*4152  cystathionine-beta-synthase  3.83
424850  AA151057  Hs.153498  chromosome 18 open reading frame 1  3.82
427585  D31152  Hs.179729  collagen, type X, alpha 1 (Schmid metaph  3.82
423052  M28214  Hs.123072  RAB3B, member RAS oncogene family  3.82
416111  AA033813  Hs.79018  chromatin assembly factor 1, subunit A (  3.82
419423  D26488  Hs.90315  KIAA0007 protein  3.80
429643  AA455889  Hs.167279  FYVE-finger-containing Rab5 effector pro  3.80
431499  NM_001514  Hs.258561  general transcription factor IIB  3.80
444078  BE246919  Hs.10290  U5 snRNP-specific 40 kDa protein (hPrp8-  3.78
430291  AV660345  Hs.238126  CGI-49 protein  3.76
431637  AI879330  Hs.265960  hypothetical protein FLJ10563  3.74
440411  N30256  Hs.151093  hypothetical protein DKFZp434G1415  3.74
405917  3.74
451230  BE546208  Hs.26090  hypothetical protein FLJ20272  3.73
429597  NM_003816  Hs.2442  a disintegrin and metalloproteinase doma  3.73
415075  L27479  Hs.77889  Friedreich ataxia region gene X123  3.72
440351  AF030933  Hs.7179  RAD1 (S. pombe) homolog  3.70
443603  BE502601  Hs.134289  ESTs, Weakly similar to KIAA1063 protein  3.70
446965  BE242873  Hs.16677  WD repeat domain 15  3.70
412350  AI659306  Hs.73826  protein tyrosine phosphatase, non-recept  3.70
433852  AI378329  Hs.126629  ESTs  3.70
447397  BE247676  Hs.18442  E-1 enzyme  3.68
405718  3.68
425217  AU076696  Hs.155174  CDC5 (cell division cycle 5, S. pombe, h  3.68


421734  AI318624  Hs.107444  Homo sapiens cDNA FLJ20562 fis, clone KA  3.67
427221  L15409  Hs.174007  von Hippel-Lindau syndrome  3.67
402408  3.66
452946  X95425  Hs.31092  EphA5  3.66
419078  M93119  Hs.89584  insulinoma-associated 1  3.66
427144  X95097  Hs.2126  vasoactive intestinal peptide receptor 2  3.65
423396  AI382555  Hs.127950  bromodomain-containing 1  3.65
446320  AF126245  Hs.14791  acyl-Coenzyme A dehydrogenase family, me  3.63
404939  3.62
403137  3.60
437162  AW005505  Hs.5464  thyroid hormone receptor coactivating pr  3.60
404210  3.59
443775  AF291664  Hs.204732  matrix metalloproteinase 26  3.56
452501  AB037791  Hs£9716  hypothetical protein FLJ10980  3.56
422443  NM_014707  Hs.116753  histone deacetylase 7B  3.55
420230  AL034344  Hs.284186  forkhead box C1  3.55
418428  Y12490  Hs.85092  thyroid hormone receptor interactor 11  3.54
433002  AF048730  Hs.279906  cyclin T1  3.53
405793  3.52
457940  AL360159  Hs.306517  Homo sapiens TRIpartite motif protein ps  3.52
402444  3.52
418250  U29926  Hs.83918  adenosine monophosphate deaminase (isofo  3.51
414222  AL135173  Hs.878  sorbitol dehydrogenase  3.51
422384  AA224077  Hs.42438  Sm protein F  3.50
447805  AW627932  Hs.19614  gemin4  3.50
454265  H03556  Hs.300949  ESTs, Weakly similar to thyroid hormone  3.50
423445  NM_014324  Hs.128749  alpha-methylacyl-CoA racemase  3.48
413435  X51405  Hs.75360  carboxypeptidase E  3.46
447210  AF035269  Hs.17752  phosphatidylserine-specific phospholipas  3.46
426931  NM_003416  Hs.2076  zinc finger protein 7 (KOX 4, clone HF.1  3.45
408418  AW963897  Hs.44743  KIAA1435 protein  3.45
421887  AW161450  Hs.109201  CGI-86 protein  3.44
TABLE 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5 and the predicted
protein contained a structural domain that is indicative of a druggable structure (e.g. protease,
kinase, phosphatase, receptor). The functional domain is indicated for each gene.
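The domain-based selection described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the set of domain names treated as druggable is hypothetical (the source gives only example classes such as protease, kinase, phosphatase and receptor), and the function name is not from the source.

```python
# Hypothetical set of structural-domain names treated as druggable.
# The source names only broad classes, so this list is an assumption.
DRUGGABLE_DOMAINS = {
    "trypsin",        # protease
    "pkinase",        # kinase
    "Y_phosphatase",  # phosphatase
    "7tm_1",          # receptor
    "7tm_2",          # receptor
}


def druggable_domains(ps_domain_field):
    """Return the domains in a comma-separated PSDomain entry that are in the set."""
    domains = [d.strip() for d in ps_domain_field.split(",") if d.strip()]
    return [d for d in domains if d in DRUGGABLE_DOMAINS]


# A gene would be kept when at least one of its predicted domains qualifies.
print(druggable_domains("SPRY,7tm_1"))
print(bool(druggable_domains("SPRY")))
```

Keeping the matched domains, rather than a bare yes or no, mirrors the table, which reports the functional domain alongside each gene.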

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
PSDomain: ProtBin Structural Domain
R1: Ratio of tumor vs. normal tissue

Pkey  ExAccn  UnigeneID  Unigene Title  PSDomain  R1

426747  AA535210  Hs.171995  kallikrein 3, (prostate specific antigen  trypsin  3130
400299  X07730  Hs.171995  kallikrein 3, (prostate specific antigen  trypsin  24.91
420757  X78592  Hs.99915  androgen receptor (dihydrotestosterone r  Androgen_recep,hormone_rec,zf-C4  19.72
408430  S79876  Hs.44926  dipeptidylpeptidase IV (CD26, adenosine  DPPIV_N_term,Peptidase_S9  16.28
430226  BE245562  Hs.2551  adrenergic, beta-2-, receptor, surface  7tm_1  15.40
411096  U80034  Hs.68583  mitochondrial intermediate peptidase  Peptidase^  1431
440286  U29589  Hs.7138  cholinergic receptor, muscarinic 3  7tm_1  12.04
420381  D50640  Hs.337616  phosphodiesterase 3B, cGMP-inhibited  PDEase  11.10
407021  U52077  gb:Human mariner1 transposase gene, comp  SET,Transposase_1  11.02
401424  arginase  938
410001  AB041036  Hs.57771  kallikrein 11  trypsin  9.03
426330  L22524  Hs.2256  matrix metalloproteinase 7 (matrilysin,  Peptidase_M10  8.76
424099  AF071202  Hs.139336  ATP-binding cassette, sub-family C (CFTR  ABC_tran,ABC_membrane  7.64
419991  AJ000098  Hs.94210  eyes absent (Drosophila) homolog 1  Hydrolase  7.20
431992  NM_002742  Hs.2891  protein kinase C, mu  pkinase,DAG_PE-bind,PH  6.49
447359  NM_012093  Hs.18268  adenylate kinase 5  adenylatekinase  6.00
400301  X03635  Hs.1657  estrogen receptor 1  Oest_recep,zf-C4,hormone_rec  5.78
421685  AF189723  Hs.106778  ATPase, Ca++ transporting, type 2C, memb  E1-E2_ATPase,Hydrolase  5.37
444042  NM_004915  Hs.10237  ATP-binding cassette, sub-family G (WHIT  ABC_tran  5.31
447752  M73700  Hs.105938  lactotransferrin  transferrin,7tm_1  5.29
407945  X69208  Hs.606  ATPase, Cu++ transporting, alpha polypep  E1-E2_ATPase,Hydrolase,HMA  5.08
403047  trypsin  4.91
427617  D42063  Hs.199179  RAN binding protein 2  Ran_BP1,zf-RanBP,TPR,pro_isomerase  4.88
422083  NM_001141  Hs.111256  arachidonate 15-lipoxygenase, second typ  lipoxygenase,PLAT  4.82
449535  W15267  Hs.23672  low density lipoprotein receptor-related  ldl_recept_b,ldl_recept_a,EGF  4.82
425071  NM_013989  Hs.154424  deiodinase, iodothyronine, type II  T4_deiodinase  4.32
423740  Y07701  Hs.293007  aminopeptidase puromycin sensitive  Peptidase_M1  4.24
424701  NM_005923  Hs.151988  mitogen-activated protein kinase kinase  pkinase  4.21
424085  NM_002914  Hs.139226  replication factor C (activator 1) 2 (40  AAA,Viral_helicase1  4.20
417531  NM_003157  Hs.1087  serine/threonine kinase 2  pkinase  4.12
428695  AI355647  Hs.189999  purinergic receptor (family A group 5)  7tm_1  3.91
410011  AB020641  Hs.57856  PFTAIRE protein kinase 1  pkinase  3.91
424850  AA151057  Hs.153498  chromosome 18 open reading frame 1  ldl_recept_a  3.82
412350  AI659306  Hs.73826  protein tyrosine phosphatase, non-recept  Y_phosphatase,Band_41,PDZ  3.70
447397  BE247676  Hs.18442  E-1 enzyme  Hydrolase  3.68
452946  X95425  Hs.31092  EphA5  EPH_lbd,fn3,pkinase,SAM  3.66
427144  X95097  Hs.2126  vasoactive intestinal peptide receptor 2  7tm_2  3.65
443775  AF291664  Hs.204732  matrix metalloproteinase 26  Peptidase_M10  3.56
457940  AL360159  Hs.306517  Homo sapiens TRIpartite motif protein ps  SPRY,7tm_1  3.52
418250  U29926  Hs.83918  adenosine monophosphate deaminase (isofo  A_deaminase  3.51
413435  X51405  Hs.75360  carboxypeptidase E  Zn_carbOpept  3.46
447210  AF035269  Hs.17752  phosphatidylserine-specific phospholipas  lipase  3.46
TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE
CANCER COMPARED TO NORMAL PROSTATE

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4



amongst 73 tumor samples. In order to remove gene-specific background levels of non-specific
hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
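The background-corrected ratio described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and sample values are hypothetical, the tumor "average" is assumed to be a mean (the corresponding sentence is incomplete in this text), and the 10th percentile is computed with linear interpolation, which the source does not specify.

```python
from statistics import mean, quantiles


def background_corrected_ratio(normal, tumor, all_tissues):
    """Ratio of average normal to average tumor after background subtraction."""
    # quantiles(n=10) returns the nine decile cut points; the first one is
    # the 10th percentile across all tissues (the background estimate).
    background = quantiles(all_tissues, n=10, method="inclusive")[0]
    numerator = mean(normal) - background
    denominator = mean(tumor) - background  # assumed to be a mean
    if denominator <= 0:
        raise ValueError("tumor level does not exceed the background estimate")
    return numerator / denominator


def is_down_regulated(normal, tumor, all_tissues, threshold=2.0):
    """Apply the 'greater than or equal to 2' criterion stated for Table 8."""
    return background_corrected_ratio(normal, tumor, all_tissues) >= threshold


# Hypothetical expression values for a single probeset.
all_tissues = list(range(0, 101))   # background estimate works out to 10.0
normal = [120, 130, 110, 140]       # mean 125
tumor = [40, 50, 60, 50]            # mean 50
print(background_corrected_ratio(normal, tumor, all_tissues))
```

With these values the corrected ratio is (125 - 10) / (50 - 10), whereas the uncorrected ratio would be 125 / 50, which shows how the subtraction sharpens differences for probesets with a high non-specific signal.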



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of normal prostate to prostate cancer

Pkey  ExAccn  UnigeneID  Unigene Title  R1

425932  M81650  Hs.1968  semenogelin I  57.69
425545  N98529  Hs.158295  Human mRNA for myosin light chain 3 (MLC  19.70
426752  X69490  Hs.172004  titin  1555
442082  R41823  Hs.7413  ESTs; calsyntenin-2  10.05
407245  X90568  Hs.172004  titin  958
422711  D60641  Hs^1739  Homo sapiens mRNA; cDNA DKFZp586I1518 (f  9.05
420813  X51501  Hs.99949  prolactin-induced protein  8.18
411987  AA375975  Hs.183380  "ESTs, Moderately similar to ALU7_HUMAN  7.45
404567  5.62
416030  H15261  Hs.21948  ESTs  551
444892  AI620617  Hs.148565  ESTs  5.27
444573  AW043590  Hs^25023  ESTs  5.20
428068  AW016437  Hs.233462  ESTs  5.08
437440  AA846804  Hs.123694  ESTs  455
404113  4.75
452279  AA286844  Hs.61260  hypothetical protein FLJ13164  4.75
421058  AW297967  Hs.188181  ESTs  4.63
445592  AV654382  Hs.17947  "ESTs, Weakly similar to KQ2F3.10 (Cele  453
405163  4.49
405227  4.45
454059  NM_003154  Hs.37048  statherin  4.45
450152  AI138635  Hs.22968  ESTs  4.40
407013  U35637  "gb:Human nebulin mRNA, partial cds"  4.03
403612  4.02
440089  AA864468  Hs.135646  ESTs  4.00
408988  AL119844  Hs.49476  Homo sapiens clone TUA8 Cri-du-chat regi  3.98
436726  AA324975  Hs.128993  "ESTs, Weakly similar to KIAA0465 protei  3.95
459367  BE148877  "gb.^M44fT0244-111199^40-h12HT0244Hom  3.95
427318  AF186081  Hs.175783  zinc transporter  3.92
411762  AW860972  "gbOV(K5T0387-180300-167-h07CT0387Hom  355
418668  AW407987  Hs57150  Human clone A9A2BR11 (CAC)n/(GTG)n repea  3.75
458311  AF069478  "gb:AF069478 Homo sapiens astrocytoma ii  3.61
403649  3.60
419682  H13139  Hs.92282  paired-like homeodomain transcription fa  3.58
412519  AA196241  Hs.73980  "troponin T1, skeletal, slow"  3.51
414206  AW276887  Hs.46609  ESTs  3.45
427419  NM_000200  Hs.177888  histatin 3  3.37
420777  AA280223  Hs.130865  ESTs  3.35
428134  AA421773  Hs.161008  ESTs  3.31
450218  R02018  Hs.168640  "Ank, mouse, homolog of"  3.30
433474  AI192195  Hs.147174  "EST, Highly similar to ubiquitin-protei  3.30
418833  AW974899  Hs.292776  ESTs  356
400440  X83957  Hs.83870  nebulin  3.16
413778  AA090235  Hs.75535  "myosin, light polypeptide 2, regulatory  3.06

423151  AW838068  "gb:QV3-LT0048-010300-109-102 LT0048 Hom  3.05
445060  AA830811  Hs.88808  ESTs  2.98
457065  AI476318  Hs.192480  ESTs  2.95
432456  H00093  "gb:ph8f12uJ9flTV Outward Alu-primed hn  2.92
405678  2.85
406707  S73840  Hs.931  "myosin, heavy polypeptide 2, skeletal m  2.81
444105  AW189097  Hs.166597  ESTs  2.78
433968  AL157518  Hs.90421  PRO2463 protein  2.73
438522  AA809431  Hs.258886  ESTs  2.73
436562  H71937  Hs.169756  "complement component 1, s subcomponent"  2.68
412417  AA102268  Hs.42175  ESTs  2.67
455590  BE072259  "gb:QV4-BT0536-271299-05^g04 BT0536 Hom  2.65
415380  F07953  Hs.16085  putative G-protein coupled receptor  2.65
428729  AL162331  Hs.191436  hypothetical protein FLJ10619  2.64
408537  AW207734  "gbAJI-H-BI2-age-h-01-(HJLs1 NCI_CGAP_S  2.63
424706  AA741336  Hs.152108  transcriptional unit N143  2.63
413212  BE072092  WM4^T0532-1^(XHW34)11 BT0532Hom  2.63
406704  M21665  Hs.929  "myosin, heavy polypeptide 7, cardiac mu  2.62
437507  AA758538  Hs.246882  ESTs  2.60
410384  AI933794  Hs.42745  ESTs  2.58
408074  R20723  Hs.124764  ESTs  2.58
436653  AA829828  Hs.292402  ESTs  2.52
458090  AI282149  Hs56213  "ESTs, Highly similar to FXD3_HUMAN FORK  2.51
432003  AI689154  Hs.122972  ESTs  2.50
436915  AA737400  Hs.142230  ESTs  2.50
410028  AW576454  Hs.258553  ESTs  2.46
448920  AW408009  Hs.22580  alkylglycerone phosphate synthase  2.45
422046  AI638562  "gb:ts50a10.x1 NCLCGAPJJH Homo sapiens  2.44
451122  AA015767  Hs.193587  ESTs  2.40
422646  H87863  Hs.151380  ESTs  2.36
451237  AW600293  "gb:EST00049 pGEM-T library Homo sapiens  2.36
400001  AFFX control: BioB-3  2.36
415835  Z45365  "gb:HSC2NF061 normalized infant brain cD  2.36
439706  AW872527  Hs59761  ESTs  2.36
423341  AW242394  Hs.252495  ESTs  2.36
436486  AA742221  Hs.120633  ESTs  2.35
407449  AJ002784  gb:Homo sapiens mRNA; fetal brain cDNA 5  2.33
430573  AA744550  Hs.136345  ESTs  2.32
401974  2.31
443356  AL044498  Hs.133262  "ESTs, Weakly similar to PH0217 reverse  2.31
430751  NM_012471  Hs.247868  transient receptor potential channel 5  2.25
439128  AI949371  Hs.153089  ESTs  2.25
448765  R15337  Hs.21958  "Homo sapiens cDNA FLJ10532 fis, clone N  2.25
451130  AI762250  Hs.211347  ESTs  2.24
405420  2.23
455029  AW851258  "r£:IL3^TG220-160200^6^06 CT0220 Hom  2.23
438224  AA933999  "gb:on91f04.s1 Soares_NFL_T_GBC_S1 Homo  2.23
407764  BE008347  "gbX)MO-BN0154^80400-3254»04BN0154Hom  2.23
413549  BE252470  "gb:601108292F1 NIH_MGC_16 Homo sapiens  2.23
437010  AA741368  Hs.291434  ESTs  2.23
435111  AI914279  Hs.213740  ESTs  2.22
403375  2.21
455060  AW853441  "gb:RC1-CT0252-03010(Hl23^09CT0252Hom  2.21
409792  AW854153  "gb:RC3^T(e5^06040O029-d03 CT0254 Hom  2.20
421154  AA284333  Hs.287631  "Homo sapiens cDNA FLJ14269 fis, clone P  2.19
401963  2.18
435034  AF168711  Hs.159397  x 010 protein  2.18
448996  AW998989  Hs.105749  KIAA0553 protein  2.18
436816  AW297599  Hs.255667  ESTs  2.17
442252  AI733395  Hs.129124  ESTs  2.17
419310  AA236233  Hs.188716  ESTs  2.16
418579  H91800  Hs.124156  ESTs  2.16
423315  R54109  Hs.26096  ESTs  2.16
432744  AA988835  Hs.38664  ESTs  2.15
424492  AI133482  Hs.165210  ESTs  2.15
424770  AA425562  "gb:zw46e05.r1 Soares_total_fetus_Nb2HF8  2.15
437101  AA744518  Hs.120610  ESTs  2.15
428793  AC004957  Hs.298975  "ESTs, Highly similar to collapsin-2-lik  2.15
415708  H56475  "gb:yt87d11.r1 Soares_pineal_gland_N3HPG  2.13

459619  2.12
427506  AK000134  Hs.179100  hypothetical protein FLJ20127  2.12
452508  AA804174  Hs.184354  ESTs  2.10
410881  AW809157  "gb:RC0-ST0118-041099-031-c07J ST0118 Homo sapiens cDNA, mRNA sequence"  2.10
403087  2.10
403869  2.10
445028  D81194  Hs.282499  ESTs  2.10
447884  H29505  "gb:ym60d10.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 5', mRNA sequence"  2.10
414575  H11257  Hs.295233  ESTs  2.09
420351  BE218221  Hs.190044  ESTs  2.08
426998  BE274360  "gb:601121068F1 NIH_MGC JO Homo sapiens cDNA clone 5', mRNA sequence"  2.08
405455  2.08
423843  AA332652  "gb:EST36627 Embryo, 8 week I Homo sapiens cDNA 5' end similar to similar to monoamine oxidase B, mRNA sequence"  2.08
406135  2.07
427046  BE246180  Hs.121385  ESTs  2.07
403493  2.05
444514  AI682905  Hs.270431  "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY [H.sapiens]"  2.05
435884  AA701443  Hs.192868  ESTs  2.05
419629  AB020695  Hs.91662  KIAA0888 protein  2.03
405900  2.03
457350  AW974438  Hs.194136  "ESTs, Moderately similar to AF091457 1 zinc finger protein RIN ZF [R.norvegicus]"  2.02
400007  AFFX control: BioDn-5  2.01
406978  M64358  "gb:Human rhom-3 gene, exon"  2.00
TABLE 8A shows the accession numbers for those primekeys lacking a UnigeneID in Table
8. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accessions" column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accessions: Genbank accession numbers

Pkey CAT number Accessions



407764 1014849_1 BE008347 BE008320 BE083307 BE083311 AW075968
408537 1064753_1 AW207734 D60164 D81150 D81078 D61356 AW996804
409792 1154677_1 AW854153 AW500210 BE145772 AW501310
410881 1225682_1 AW809157 AW812181 AW812175 AW812172 AW812161 AW812165
411762 1256906_1 AW860972 AW862598 AW862599 AW860988 AW860983 AW860898 AW860925 AW860922 AW860986 AW860984 AW860989
413212 1353792_1 BE072092 BE072106 BE072086 BE072098 BE072103
413549 1375933_2 BE252470 BE147573
415708 1548209_1 H56475 F29401 F34552
415835 1558511_1 245365 R25905 H05203 T77496
422046 210744_1 AI638562 T16929 H13401 F07773 R55836
423151 225415_1 AW838068 AW837986 AW838067 AA322487 AW837936
423843 232510_1 AA332652 AA331633 AW999369 AW9Q2993 BE170475 AA378845 AW964175 AI475221
424770 243504_1 AA425562 AI880208 AA346646 N22655 AW811775 AW811786
426998 274259_-1 BE274360
432456 347718_2 H00093 H00079 H00070 H00054 H00049 H00063 AW905306 AW905241 AW905410 AW905307 AW905411 AW905240 AW905210 AW905352 AW905304 AW905239 AW905242 AW905243 H00087
438224 452656_1 AA933999 M781181
447884 740749_1 H29505 R18575 Z43580 T48738 AI435454 BE004683
451237 863269_1 AW600293 AI767468
455029 1249374_1 AW851258 AW851435 AW851106 AW851421
455060 1251259_1 AW853441 BE145228 BE145218 BE145162 BE145283
455590 1335127_1 BE072259 BE072230 BE007911
458311 543550_1 AF069478 AF069479 AF069480
TABLE 8B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Table 8. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

401963 3126783 Plus 51382-51521
401974 3126777 Plus 85330-85683
403087 8954241 Plus 16951 M69795
403375 9255944 Minus 92554-92795
403493 7341425 Plus 157568-159084
403612 8469060 Minus 94723-94859
403649 8705159 Minus 27141-27247
403869 7280046 Minus 34379-34583
404113 9588571 Minus 13446-13646
404567 7249169 Minus 101320-101501
405163 9966267 Minus 161171-161299
405227 6731245 Minus 22550-22802
405420 7211837 Minus 13428-13582
405455 7656675 Plus 134112-134671
405678 4079670 Plus 151821-152027
405900 6758795 Minus 71181-71535
406135 9164918 Minus 65489-65715
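
By way of illustration only, the Nt_position entries of Table 8B can be read as "start-end" ranges. The Python sketch below splits such an entry into integers and computes the number of nucleotides covered. The function names are hypothetical, and the sketch assumes that both endpoints are inclusive, which the table itself does not state.

```python
def parse_nt_position(entry):
    """Split a 'start-end' Nt_position entry into two integers."""
    # A damaged entry without exactly one hyphen raises ValueError here.
    start_text, end_text = entry.strip().split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start {start} exceeds end {end}")
    return start, end


def exon_span(entry):
    """Nucleotides covered by the range, assuming inclusive endpoints (an assumption)."""
    start, end = parse_nt_position(entry)
    return end - start + 1


if __name__ == "__main__":
    # First row of Table 8B (Pkey 401963).
    print(exon_span("51382-51521"))
```

An entry that cannot be split into two integers is rejected rather than guessed at, so illegible positions remain visible as errors.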
TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 8.14. The "average" normal prostate level was set to the mean
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated.
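
The selection arithmetic described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to build Table 9: the function names are hypothetical, the linear-interpolation percentile is an assumption (the text does not say how percentiles were computed), and the handling of a non-positive denominator is likewise assumed. Which level serves as numerator is left to the caller.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def background_subtracted_levels(normal, tumor, all_tissues):
    """Return (normal_level, cancer_level) for one probeset.

    normal_level: mean of the normal prostate samples
    cancer_level: 85th percentile of the tumor samples
    Both have the 10th percentile over all tissues subtracted.
    """
    background = percentile(all_tissues, 10)
    normal_level = mean(normal) - background
    cancer_level = percentile(tumor, 85) - background
    return normal_level, cancer_level


def expression_ratio(numerator_level, denominator_level):
    """Ratio of two background-subtracted levels; None if undefined (assumed handling)."""
    if denominator_level <= 0:
        return None
    return numerator_level / denominator_level


if __name__ == "__main__":
    tissues = list(range(101))  # made-up intensities, for illustration only
    n_level, c_level = background_subtracted_levels([12, 12, 12, 12], tissues, tissues)
    print(n_level, c_level, expression_ratio(c_level, n_level))
```

A probeset would then be retained when the chosen ratio meets the stated cutoff.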






Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of prostate cancer to normal prostate

Pkey ExAccn UnigeneID Unigene Title R1


451UUZ AAUIoZot 


He flMB FQTe WnaMi/ cimitar tn All H HI IMAM AI 1 1 


158400 


435590 AAdoSMoo 


Uc 1CQQQQ CQTc 

ns, loo y yy cots 


718 00 


AAQK7R AtnTQH07 
44O0/0 AIU/OUc/ 


He IfiCmfl F<5Tc 


246.86 


A0A0A7 AAQ9A11R 


He979flA5 CCTe 
nox f cXJOO CO 1 o 


245.20 


AtY\AQO A.tffWI1fl£ 

4UU4oJ AKUUUloo 


nh'Unmn coruano «nMfl CI I0A170 fie Wnno 

yD.nomo sapiens cuiw ru^u i / o lis, auiiu 








221.33 


4il/yU0 SVUJ04OJU 


He 1fiR59n PCTc 


212.00 


*WoOOO MIOODOOU 


no. i f *rHo i co i o 


163.20 




Hs 193237 ESTs 


149.45 




Hc11fi9 malnr hict/Vfimnafthflrh/ ramnlPY nlnec 


126.11 


429430 M36860 Hs.9295 elastin (supravalvular aortic stenosis, 123.27
426025 AW138330 Hs.233778 ESTs 120.00
418917 X02994 Hs.1217 adenosine deaminase 106.75
404407 105.71
442027 AI652926 Hs.128395 ESTs 100.53
433704 AA608684 Hs.121705 ESTs, Moderately similar to ALUC_HUMAN I 94.00
453758 U83527 gb:HSU83527 Human fetal brain (M.Lovett) 89.18
415354 F06495 gb:HSC1AB051 normalized infant brain cDN 87.73
424239 M67439 Hs.143526 dopamine receptor D5 86.82
444143 AW747996 Hs.160999 ESTs 86.43
401672 77.26
430590 AW383947 Hs.246381 CD68 antigen 68.47
411972 BE074959 gb*»Mf>BT0582-3101(XM)01-t08 BT0582 Homo 68.00
448992 AI766053 Hs.188346 ESTs 61.26
408828 BE540279 gb«01059857F1 NIH_MGCJ0 Homo sapiens c 57.71
409653 AW451693 Hs.220826 ESTs 56.40
402964 54.67
422673 N59027 gb:yv59d11.r1 Soares fetal liver spleen 54.00
422568 AA372275 Hs.279800 Homo sapiens cDNA FLJ11383 fis, clone HE 54.00
438907 R32704 Hs.301298 ESTs 52.96
405172 52.96
444897 AW137088 Hs.144857 ESTs 52.32
458019 AW592931 Hs.256298 ESTs 51.63
405275 AB028989 Hs.88500 mitogen-activated protein kinase 8 inter 50.98
457815 AA703679 Hs.106999 ESTs, Weakly similar to SYT5_HUMAN SYNAP 49.60
424385 AA339666 gb:EST44776 Fetal brain I Homo sapiens c 48.90
407172 T54095 gb:ya92c05.s1 Stratagene placenta (93722 47.98
428202 AA424163 Hs.156895 ESTs 46.83
435672 AI700148 Hs.283626 ESTs 43.57
420283 AA485224 Hs.57734 G protein-coupled receptor kinase-intera 43.00
417016 AA837098 Hs.269933 ESTs 42.70
438854 AF074994 Hs.24240 ESTs 42.57
406134 42.43

457319 AA480895 Hs201552 ESTs, Weakly similar to T17288 hypotheti 42.31 

409314 AA070266 gb:zm69d04.r1 Stratagene neuroepithelium 4255

401124 41.61 

5 429316 AI371157 Hs.178538 ESTs 40.00 

420317 AB006628 Hs.96485 KIAA0290 protein 39.64

457566 AW062439 gb:MRf>CT0060-120899«001-f08 CT0060 Homo 39.60 

417407 AA923278 Hs2909Q5 ESTs, Weakly similar to protease [H.sapi 38.73 

430269 BE221682 Hs.178364 ESTs 38.06 

10 439602 W79114 Hs5855B ESTs 36.69 

433686 AA604799 Hs.136528 ESTs, Moderately similar to ALU1_HUMAN A 3629 

417993 AW963705 Hs295806 ESTs, Weakly similar to ALU7_HUMAN ALU S 36.18 

428214 AA936282 Hs.120397 ESTs 36.10 

416908 AA333990 Hs.80424 coagulation factor XIII, A1 polypeptide 36.08 

426264 BE314852 Hs.168694 hypothetical protein FLJ10257 36.00

415911 H08796 Hs.124952 ESTs 36.00 

457502 AA076049 Hs.274415 Homo sapiens cDNA FLJ10229 fis, clone HE 3523

421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 3520

401468 3459 

20 458561 AI220150 Hs21 1195 ESTs 34.60 

433601 BE350738 Hs.1 23993 ESTs, Weakly similar to T00366 hypotheti 3324 

454977 AW848032 gfcJl3-CT<E14-231299-053O11 CT0214Homo 32.96 

402828 32.93 

414522 AW518944 Hs.76325 Homo sapiens cDNA: FU23125 fis, clone L 31.76 

25 402842 3158 

421245 AA285363 p#HTH280 HTCDL1 Homo sapiens cDN A 573 3159 

401631 F05183 Hs.1799 CD1D antigen, d polypeptide 3126 

408057 AW139565 gb:UI-H-BI1*ea-d-04-0-Ul.s1 NCI_CGAP_Su 3124 

408069 H81795 gb:ys68a10j1 Scares retina N2b4HR Homo 3120 

30 438694 T87479 Hs291797 ESTs 31.09 

449156 AF103907 Hs.171353 prostate cancer antigen 3 29.78

428796 AU076734 Hs.193665 solute carrier family 28 (sodium-coupled 29.76

452549 AI907039 gb:PM-BT1 34-020499-566 BT134 Homo sapien 29.59 

410129 BE244074 Hs285531 regulator of Fas-induced apoptosis 2953 

35 414464 AI870175 Hs.13957 ESTs 29.47 

412326 R07566 Hs.73817 small inducible cytokine A3 (homologous 2922

459081 W07808 gb2bQ3a12.r1 Soares_fetal_lung_NbHL19W 2920 

448702 AW102670 Hs,122464 ESTs 29.13 

451939 U80456 Hs27311 single-minded (Drosophila) homolog 2 28.74 

40 443412 W84893 Hs.9305 angiotensin receptor-like 1 28.61 

457324 AB028990 Hs243901 KIAA1067 protein 2824 

424247 X14008 Hs234734 lysozyme (renal amyloidosis) 28.18 

457140 AI279960 Hs.178140 ESTs 28.12 

444151 AW972917 Hs.128749 alpha-methylacyl-CoA racemase 28.06

45 457669 AW104257 Hs.123426 ESTs, Weakly similar to putative serine/ 27.61 

412429 AV650262 Hs.75765 GRO2 oncogene 27.36

405495 27.33 

406516 2725 

407997 AW135429 Hs243577 ESTs 2856 

50 442115 AW452332 Hs257554 ESTs 26.36 

409038 T97490 Hs50002 small inducible cytokine subfamily A (Cy 2654 

402838 26.32 

449846 AI979284 Hs200552 ESTs * 2621 

417153 X57010 Hs.81343 collagen, type II, alpha 1 (primary oste 2620 

55 439792 NMJH4856 Hs.6684 KIAA0476 gene product 25.91 

450096 A1682088 Hs223368 ESTs 25.60 

424196 AL133660 Hs.142926 Homo sapiens mRNA; cONA DKFZp434M0927 (f 2557 

414246 BE391090 Hs280278 EST 2557 

420848 NM_005188 Hs.99980 Cas-Br-M (murine) ecotropic retroviral t 25.48

60 424778 AA251048 Hs.153042 lymphocyte antigen 9 25.42 

409126 AA063426 gb:zf70c08.s1 Soares_pineal_gland_N3HPG 2525

443936 AW083491 Hs.31196 ESTs 2522 

419392 W28573 gb51f10 Human retina cDNA randomly prim 25.01 

411201 T74588 Hs.8509 ESTs, Weakly similar to C03_HUMAN COMPLE 2455 

65 422940 BE077458 gb:RC1-BT0606-09050(K>15-b04 BT0606 Homo 24.76 

437571 AA760894 Hs.153023 ESTs 24.74 

433973 AI014723 Hs.131770 ESTs 2457 

422416 BE019557 Hs.1 1900 Human DNA sequence from clone RP4-583P15 2453 

421552 AF026692 Hs.105700 secreted frizzled-related protein 4 24.49
443668 U25758 Hs.134584 ESTs 24.49

424800 AL035588 Hs.153203 MyoD family inhibitor 24.10 

453633 AA357001 Hs.34045 hypothetical protein FLJ20764 24.04

430565 AL122081 Hs244343 cadherin related 23 24.00 

433694 AI208611 Hs.12066 Homo sapiens cDNA FLJ11720 fis, clone HE 23.89

451045 AA215672 gb:zr96e09.s1 NCI_CGAP_GCB1 Homo sapiens 23.83 

408583 AW449674 Hs.47359 ESTs 23.73 

444040 AF204231 Hs.182982 golgin-67 23.62 

414182 AA136301 gbzk93g04.s1 Soaresj>regnantuterus_NbH 23.39 

418678 NM_001327 Hs.167379 cancer/testis antigen 2320

408380 AF123050 Hs.44532 diubiquitin 2258

456076 BE243877 Hs.76941 ATPase, Na+/K+ transporting, beta 3 poly 22.65

418299 AA279530 Hs.83968 integrin, beta 2 (antigen CD18 (p95), ly 223

444917 R68651 Hs.144997 ESTs 2226 

15 444381 BE387335 HS283713 ESTs 22.08 

415788 AW628686 Hs.78851 KIAA0217 protein 22.04 

410896 AW809637 gb:MR4-ST0124-261099-015-b07 ST0124 Homo 22.00 

412978 AI431708 Hs.820 homeo box C6 21.95

458418 AV653846 Hs.126261 Homo sapiens Chromosome 16 BAC clone CIT 21.94 

20 454791 BE071874 gb:R(2-BT0522-120200-014-a06 BT0522 Homo 21.84 

408748 J05500 Hs.47431 spectrin, beta, erythrocytic (includes s 2126 

416011 H14487 gfcym18c10.r1 Scares infant brain 1 NIB H 2124 

440474 AI207936 Hs.7195 gamma-aminobutyric acid (GABA) A recepto 21.14

447047 AI623698 Hs.246306 Homo sapiens cDNA FLJ23529 fis, clone L 21.11

426793 X89887 Hs.172350 HIR (histone cell cycle regulation defec 21.10

409841 AW502139 gb:Umf^BR0fhajr-e^)5^HJl/1 NIH_MGC_5 21.07 

405685 2050 

457359 AI983207 Hs.192481 ESTs, Weakly similar to SYPH_HUMAN SYNAP 20.84 

423067 AA321355 Hs285401 ESTs 20.74 

30 422355 AW403724 Hs.140 immunoglobulin heavy constant gamma 3 (G 20.73 

401201 20.73 

458278 W28912 Hs.129019 ESTs 20.68 

439097 H66948 gb:yr86d10.rl Soares fetal liver spleen 20.67 

414875 H42679 Hs77522 major histocompatibility complex, class 20.66 

35 400926 20.66 

451355 NMJJ04197 Hs.444 serine/threonine kinase 19 20.64 

446982 AW500221 Hs.43616 Homo sapiens mRNA for FU00029 protein, 20.61 

417105 X60992 Hs.81226 CD6 antigen 20.61 

405777 20-51 

40 424123 AW966158 Hs58582 Homo sapiens cDNA FLI12702 fis, clone NT 2020 

425009 X58288 Hs.154151 protein tyrosine phosphatase, receptor t 20.10 

443271 BE568568 Hs.195704 ESTs 19.98 

421064 AI245432 Hs.101382 tumor necrosis factor, alpha-induced pro 19.98 

418819 AA228776 Hs.191721 ESTs 19.94 

45 457595 AA584854 gb:no09h11.s1 NCIjCGAP.Phel Homo sapiens 1950 

404426 19 - M 

412571 U43143 Hs.74049 fms-related tyrosine kinase 4 19.79 

431457 NM_012211 Hs256297 integrin, alpha 11 19.62 

414002 NM_006732 Hs.75678 FBJ murine osteosarcoma viral oncogene h 1957

50 418994 AA296520 Hs.89546 Selectin E (endothelial adhesion molecul 1956 

437158 AW090198 Hs.4779 KIAA1 150 protein 1952 

437866 AA156781 Hs.83992 ESTs 19.44 

417421 AL138201 Hs.82120 nuclear receptor subfamily 4, group A, m - 1954 

433057 X15675 Hs296832 Human pTR7 mRNA for repetitive sequence 1922 

421730 AW449808 Hs.164036 glucosamine (N-acetyl)-6-sulfatase (Sanf 1921

456557 AA284477 Hs.96618 ESTs 18.77 

440806 AI247422 Hs.129966 ESTs 18.76 

439845 AL355743 Hs56663 Homo sapiens EST from clone 41214, full 18.65 

416155 AI807264 Hs.205442 ESTs, Weakly similar to AF117610 1 inner 18.64

60 437820 AA769062 Hs.16029 ESTs, Weakly similar to alternatively sp 18.62 

450923 AW043951 Hs.38449 ESTs 1859 

418329 AW247430 Hs.84152 cystathionine-beta-synthase 1858 

424537 AI673027 Hs.143271 ESTs 1855 

447742 AF113925 Hs.19405 caspase recruitment domain 4 1852 

65 415251 R42863 Hs.7124 ESTs 18.47 

440770 AA912815 Hs222078 ESTs 18.40 

407711 AI085846 Hs25522 ESTs 18.32 

427157 U51166 Hs.173824 thymine-DNA glycosylase 1828

409847 AW501751 Hs279733 ESTs 18.15 
417240 N57568 Hs.176028 EST 18.13

435732 AF229178 Hs.123136 leucine rich repeat and death domain con 18.12 

436896 AW977385 Hs278615 ESTs 18.12 

432485 N90866 Hs276770 CDW52 antigen (CAMPATH-1 antigen) 17S0 

5 429490 AI971131 Hs293684 ESTs, Weakly similar to alternatively sp 17.82 

429984 AL050102 Hs227209 DKFZP586F1 01 9 protein 1732 

449214 AI889114 Hs.195663 ESTs 17.75 

433867 AK000596 Hs.3618 hippocalcin-like 1 17.72

431735 AW977724 Hs.75968 thymosin, beta 4, X chromosome 17.71 

10 401515 17.67 

444045 AI097439 Hs.135548 ESTs 1758 

442754 AL045825 Hs210197 ESTs 1755 

426559 AB001914 Hs.170414 paired basic amino acid cleaving system 1754

432415 T16971 Hs289014 ESTs 1750 

15 427829 AI188225 Hs.127482 ESTs 1750 

432516 R08003 Hs.188013 ESTs 1744 

435259 AA152106 Hs.4859 cyclin L ania-6a 1736 

414989 T81668 gb:yd29c04.r1 Soares fetal liver spleen 1731

444880 AW1 18683 Hs.154150 ESTs 1730 

20 417651 R06874 Hs268628 ESTs 1727 

453457 AL0371O3 Hs270599 ESTs, Weakly similar to unnamed protein 1722 

424246 AW452533 Hs.143604 Kaiso 1722 

419078 M93119 Hs.89584 insulinoma-associated 1 17.18 

417696 BE241624 Hs.82401 CD69 antigen (p60, early T-cell activati 17.14

431117 AF003522 Hs.250500 delta (Drosophila)-like 1 17.14

455254 AW877015 gb.OV2-PT0010-25030(M)96-f12 PT0010 Homo 17.14 

425782 U66468 Hs.159525 cell growth regulatory with EF-hand doma 17.12 

426678 H08170 Hs.1 13755 ESTs 17.12 

426403 NM_000361 Hs.2030 thrombomodulin 17.01

425905 AB032959 Hs.161700 KIAA1133 protein 17.00

438867 AW451157 Hs.181157 ESTs 16.98 

420940 AA830664 Hs.143974 ESTs 1634 

459234 A1940425 gb:CM&CT0052-150799-024-c04 CT0052 Homo 1632 

404756 1691 

35 422247 U18244 Hs/1 13602 solute carrier family 1 (high affinity a 16.90 

420568 F09247 Hs.167399 protocadherinalphaS 1638 

443559 AI076765 HS269899 ESTs 1630 

438703 AI803373 Hs31599 ESTs 16.78 

411424 AW845985 gb:RC2-CT0163-200999«002-H08CT0163Homo 1670 

40 402895 1639 

422538 NM_006441 Hs.118131 5,10-methenyltetrahydrofolate synthetase 16.68

447108 AW449602 Hs217953 ESTs, Moderately similar to NK-TUMOR REC 16.65 

448520 AB002367 Hs.21355 doublecortin and CaM kinase-like 1 1654

438567 AW451955 Hs.153065 ESTs 1652 

407811 AW190902 Hs.40098 cysteine knot superfamily 1, BMP antagon 1650

410721 R23534 Hs.2730 heterogeneous nuclear ribonucleoprotein 1650

437133 AB018319 Hs5460 KIAA0776 protein 16.40 

408182 AA047854 gbzf49g04.rl Soares retina N2b4HR Homo 1632 

417315 AI080042 Hs.180450 ribosomal protein S24 1630 

431840 AA534908 Hs.2860 POU domain, class 5, transcription facto 1628

439882 AA847856 Hs.124565 ESTs 1620 

418277 AW135221 Hs.130812 ESTs 1639 
410688 AW796342 p^:PM2-UM()027-23020(HX)2-h02 UM0027 Homo ' 16.04 

420120 AL049610 Hs.95243 transcription elongation fador A {S\\y 16.04 

55 429597 NM_003816 Hs2442 a disintegrin and metalloproteinase doma 16.02 

447033 AI357412 Hs.157601 EST - not in UniGene 16.02

421684 BE281591 Hs.106768 hypothetical protein FLJ10511 15.94

408599 AA055800 Hs222933 ESTs 15.93 

446012 AV656098 Hs.1 72382 hypothetical protein FLJ20001 15.86 

60 409671 AA076769 gb:7B02B10 Chromosome 7 Fetal Brain cDNA 15.85 

405934 . 15.84 

426108 AA622037 Hs.166468 programmed cell death 5 15.84

416208 AW291168 Hs.41295 ESTs 15.48 

410708 AA534370 Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K 15.42

65 447342 AI199268 Hs.19322 ESTs; Weakly similar to 111! ALU SUBFAMl 1538 

454563 AW807530 gb^)MO-ST0081 -130999-054^102 ST0081 Homo 1537 

411507 AW850140 gb:IL3-CT021 9-261099-023-01 1 CT0219 Homo 1536 

438170 AI916685 Hs.194601 ESTs 1529 

416292 AA179233 Hs.42390 nasopharyngeal carcinoma susceptibility 1526
406638 M13861 gb:Human T-cell receptor active beta-cha 15.26
446686 AW138043 Hs.156307 ESTs 15.25
434485 AI623511 Hs.118567 ESTs 15.24
441188 AW292830 Hs.255609 ESTs 15.22
444172 BE147740 Hs.104558 ESTs 15.22
409521 BE244854 Hs.159578 Homo sapiens mRNA for FLJ00020 protein, 15.16
420748 AA279956 Hs.88672 ESTs 15.14
422583 AM10506 Hs.118578 H.sapiens mRNA for ribosomal protein L18 15.14
424240 AB023185 Hs.143535 calcium/calmodulin-dependent protein kin 15.12
451118 AI862096 Hs.60640 ESTs 15.12
437495 BE177778 gb:RC1-HT0598-31030M12-f07 HT0598 Homo 15.12
445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 15.06
418305 AW006783 Hs.6686 ESTs 15.03
402812 15.02
436851 AA732480 Hs.293581 ESTs 15.00
400991 15.00
415752 BE314524 Hs.78776 Human putative transmembrane protein (nm 14.96
429900 AA460421 Hs.30875 ESTs 14.90
403683 14.84
430315 NM_004293 Hs.239147 guanine deaminase 14.80
451952 AL120173 Hs.301663 ESTs 14.72
424687 J05070 Hs.151738 matrix metalloproteinase 9 (gelatinase B 14.69
447229 BE617135 gb:601441677F1 NIH_MGC_65 Homo sapiens c 14.67
425818 AB021225 Hs.159581 matrix metalloproteinase 17 (membrane-in 14.65
448553 AI638449 Hs.173031 ESTs 14.63
431089 BE041395 Hs.283676 ESTs, Weakly similar to unknown protein 14.60
459145 AI903354 gb:RC-BT029-100199-117 BT029 Homo sapien 14.55
449650 AF055575 Hs.297647 ESTs, Moderately similar to calcium chan 14.54
400952 14.46
445885 AI734009 Hs.127699 EST cluster (not in UniGene) 14.44
407938 AA905097 Hs.85050 phospholamban 14.42
431676 AJ685464 Hs.292638 ESTs 14.40
437210 AA311443 Hs.293563 Homo sapiens mRNA; cDNA DKFZp586E2317 (f 14.36
451900 AB023199 Hs.27207 KIAA0982 protein 14.36
445800 AA126419 Hs.301632 ESTs 14.32
412368 AW945992 Hs.181125 immunoglobulin lambda locus 14.31
409055 AW304028 Hs.300578 ESTs 14.23
408763 W57550 Hs.301526 Homo sapiens cDNA FLJ13181 fis, clone NT 14.22
446734 AUM9278 Hs/16074 Homo sapiens mRNA; cDNA DKFZp564t153 (fr 14.22
413551 BE242639 Hs.75425 ubiquitin associated protein 14.22
421913 AI934365 Hs.109439 osteoglycin (osteoinductive factor, mime 14.22
452712 AW838616 gb«HC5-LT0054-1402QM13-D01 LT0054 Homo 14.22
451468 AW503398 Hs.210047 ESTs 14.16
406038 Y14443 Hs.88219 zinc finger protein 200 14.14
424909 S78187 Hs.153752 cell division cycle 25B 14.07
434078 AW880709 Hs.283683 EST 14.07
415254 AI815831 Hs.184378 ESTs 14.05
418196 AI745649 Hs.26549 ESTs, Weakly similar to T00066 hypotheti 14.02
410020 T86315 Hs.728 ribonuclease, RNase A family, 2 (liver, 13.98
411352 NM_002890 Hs.758 RAS p21 protein activator (GTPase activa 13.98
429848 AF145439 Hs22594$ chemokine (C-C motif) receptor 9 13.95
413729 BE159999 gb^V14HT0412-270300-123Kl10HT0412 Homo 13.90
400125 13.88
420319 AW406289 Hs.96593 hypothetical protein 13.85
448272 AI479094 Hs.170786 ESTs 13.80
422695 AA315158 gb:EST186956 HCC cell line (metastasis t 13.80
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78
458048 H30340 Hs.173705 Homo sapiens cDNA: FLJ22050 fis, clone H 13.78
408894 AI935400 Hs.217286 ESTs 13.76
454093 AW860158 gb:RC0-CT0379-2901 Q0-O32-D04 CT0379 Homo 13.75
410889 X91662 Hs.66744 twist (Drosophila) homolog (acrocephalos 13.74
457751 AI908236 gb:IL-BT166-180399-010 BT166 Homo sapien 13.72
455131 AW857913 gb:RCO-CT0323-231199-031-b05 CT0323 Homo 13.69
408364 AW015238 Hs.128453 ESTs 13.67
425907 AA365752 Hs.155965 ESTs 13.62
402359 13.60
401044 13.53
409877 AW502498 Hs.157150 ESTs, Weakly similar to zinc finger prot 13.53
423690 AA329648 Hs.23804 ESTs 13.49
430685 AI690234 Hs.191666 ESTs, Weakly similar to reverse transcri 13.47

414052 AW578849 Hs283552 ESTs, Weakly similar to unnamed protein 13.46 

447858 AW080339 Hs211911 ESTs 13.44 

435716 AI573283 Hs.38458 ESTs 13.44 

439120 H56389 gb:yt87c03.r1 Soares_pineal_gland_N3HPG 13.43

402788 13.40 

451591 AA886446 Hs.146278 ESTs 13.40 

405411 13.38 

426558 AW188574 Hs24218 ESTs 13.34 

10 453506 AA132818 Hs.110407 ESTs, Weakly similar to coded for by C. 13.33 

416445 AL043004 Hs.300678 Human serine/threonine kinase mRNA, part 13.32

457084 AI074149 Hs.150905 ESTs, Weakly similar to chondroitin 4-su 1352 

403838 13.32 

427337 Z46223 Hs.176663 Fc fragment of IgG, low affinity IIIb, r 1350

15 434318 AW207552 Hs.1 16328 ESTs, WeaWy similar to dJ134E1 5.1 (H^a 1328 

435193 N41359 Hs218107 ESTs 1328 

414756 AW451101 Hs.159489 ESTs, Moderately similar to hexokinase I 1327 

420626 AF043722 Hs.99491 RAS guanyl releasing protein 2 (calcium 1326

420052 AA416850 Hs.44410 ESTs 1325

414020 NM_002984 Hs.75703 small inducible cytokine A4 (homologous 1325

403851 1324 

422647 W07492 Hs.157101 ESTs 1321 

433598 AI762836 Hs271433 ESTs, Moderately similar to ALU2_HUMAN A 1321 

409065 AB033113 Hs50187 KIAA1287 protein 1320 

435063 R21966 Hs.57734 G protein-coupled receptor kinase-intera 13.19

439367 BE386844 Hs248746 ESTs 13.17 

451957 AI796320 Hs.10299 Homo sapiens cDNA FLJ13545 fis, clone PL 13.16

420569 AA278362 Hs289062 Homo sapiens cDNA FU12334 fis, clone MA 13.14 

447883 BE262802 Hs.4909 dickkopf (Xenopus laevis) homolog 3 13.07 

426490 NM_001621 Hs.170087 aryl hydrocarbon receptor 13.06

414789 AA155859 Hs.79708 ESTs 13.05 

451418 BE387790 Hs26369 ESTs 13.04 

443494 T99719 Hs270404 Homo sapiens cDNA: FU22389 fis, clone H 13.03 

425878 AW964806 Hs.38085 ESTs, Weakly similar to putative glycine 13.02 

35 431912 AI660552 Hs.1 54903 ESTs, Weakly similar to A561 54 Abisubst 13.00 

407122 H20276 Hs.31742 ESTs 13.00 

456491 AL137468 Hs.97277 Homo sapiens mRNA; cDNA DKFZp434H1322 (f 12.99 

448172 N75276 Hs.135904 ESTs 1258 

452144 AA032197 Hs.102558 ESTs 1256 

40 419953 BE267154 Hs.125752 ESTs 1256 

416182 NM_004354 Hs.79069 cyclin G2 1254

451154 AA015879 Hs53536 ESTs 1253 

412257 AW903830 gb^M4^N1037-250400-155-h04NN1 037 Homo 1253 

449764 AW161319 Hs.12915 ESTs 1252 

45 432695 D63480 Hs278634 KIAA01 46 protein 1252 

454105 NM_001259 Hs58481 cyclin-dependent kinase 6 1252

439093 AA534163 Hs5476 serine protease inhibitor, Kazal type, 5 1250 

416098 H41324 Hs51581 ESTs, Moderately similar to ST1B_HUMAN S 1258 

424897 D63216 Hs.153684 frizzled-related protein 1258

50 414604 AU076649 Hs.76556 growth arrest and DNA-damage-inducible 3 1258 

414664 AA587775 Hs.66295 Homo sapiens HSPC311 mRNA, partial cds 1254 

452560 BEO77084 gbflC5-BT0603-220200-013-C07 BT0603 Homo 1254 

413869 NM_000878 Hs.75596 interleukin 2 receptor, beta 1250

452359 BE167229 Hs29206 Homo sapiens clone 24659 mRNA sequence 1250 

435886 BE265839 Hs.12126 hepatocellular carcinoma-associated anti 12.78

445230 U97018 Hs.12451 echinoderm microtubule-associated protei 12.78

412226 W26786 gb:15d7 Human retina cDNA randomly prime 12.77 

446619 AU076643 Hs513 secreted phosphoprotein 1 (osteopontin, 12.76 

447769 AW873704 Hs.48764 ESTs 12.76 

60 414478 AI306389 Hs.76240 adenylate kinase 1 12.76 

425383 D83407 Hs.156007 Down syndrome critical region gene 1-lik 12.68

450704 H85157 Hs.40696 ESTs 1256 

405856 12.66 

412935 BE267045 Hs.75064 tubulin-specific chaperone c 12.65

65 402802 12.62 

452588 AA889120 Hs.110637 homeo box A10 12.62

419978 NM_001454 Hs53974 forkhead box J1 12.62

403137 12.60 

430226 BE245562 Hs2551 adrenergic, beta-2-, receptor, surface 1257 
448076 AJ133123 Hs.20196 adenylate cyclase 9 1256
450462 F07097 Hs.300828 Homo sapiens mRNA full length insert cDN 1254
405236 1252
409292 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 12.47

421540 AA767669 Hs.10242 ESTs 12.47
425840 AW978731 Hs.301824 ESTs 12.44
443181 AI039201 Hs.54548 ESTs 12.42 
452436 BE077546 Hs.31447 ESTs 12.42 
455183 AW984111 gb:RCO-HN0007-16030W)11-{09HN0007Homo 12.40 

10 432887 AI926047 Hs.162859 ESTs 1237 

410494 M36564 Hs.64016 protein S (alpha) 12.36 

439024 R96696 Hs.35598 ESTs 12.36 

451246 AW1 89232 Hs.39140 cutaneous T-cell lymphoma tumor antigen 1236 

432892 AL042615 H&15995 ESTs 1235 

15 418982 AI348838 Hs.13073 ESTs 1235 

414516 A1307802 HS279551 ESTs 1234 
440134 BE410734 gb:601301619F1 NIHJUGC__21 Homo sapiens c 1229 

443873 AL048542 Hs.16291 ESTs 1228 
401286 1226 

20 454020 AW962845 Hs256527 ESTs 1224 

420077 AW512260 Hs.87767 ESTs 1224 

443837 AI984625 Hs.9884 spindle pole body protein 1224 

407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 1223 

435839 AF249744 Hs25951 Rho guanine nucleotide exchange factor ( 1222 

25 448552 AW973653 Hs20104 hypothetical protein RJ00052 1220 

405325 1220 

451009 AA013140 Hs.115707 ESTs 12.18 

423066 Y18264 Hs.120171 ESTs 12.17 

439556 AI623752 Hs.163603 ESTs 12.16 

30 443062 N77999 Hs.8963 Homo sapiens mRNA full length insert cDN 12.15 

445873 AA250970 Hs.251946 Homo sapiens cDNA: FU23107 fis, clone L 12.14 

453542 AW836724 Hs.33190 Homo sapiens mRNA expressed only in plac 12.11 

440106 AA864968 Hs.127699 ESTs 12.10 

417605 AF006609 Hs.82294 regulator of G-protein signalling 3 12.10 

35 440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04 

420061 AW024937 Hs29410 ESTs 12.02 

458727 AI022813 Hs.92679 Homo sapiens clone COABP0014 mRNA sequen 11.96 

445407 AI222658 Hs221889 ESTs, Weakly similar to la costa [D.mela 1155 

418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 11.94

414129 AI990287 Hs.270798 ESTs 1153

409799 D11928 Hs.76845 phosphoserine phosphatase-like 11.92

438461 AW075485 Hs.286049 phosphoserine aminotransferase 11.92

443912 R37257 Hs.184780 ESTs 11.92 

424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 1150 

45 434217 AW014795 Hs23349 ESTs 1150 

451533 NM_004657 Hs.26530 serum deprivation response (phosphatidyl 11.90

422423 AF283777 Hs.116481 CD72 antigen 1139

409398 AW386461 gb:PM4-PT0019-121299^K)4-F02PT0019Homo 1139

423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 1132

50 446180 AI074413 Hs.14220 hypothetical protein FU20450 1130 

414341 D80004 Hs.75909 KIAA0182 protein 1130 

406538 11.79 

433253 AW450502 Hs24218 ESTs - 11.79 

447397 BE247676 Hs.18442 E-1 enzyme 11.78 

55 451684 AF216751 Hs26813 CDA14 11.76 

416862 R23765 Hs23575 ESTs 11.74 

425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72

428826 AL048842 Hs.194019 attractin 1172 

433037 NMJM4158 Hs279938 HSPC067 protein 1172 

60 447476 BE293466 Hs20880 ESTs 1172 

452092 BE24S374 Hs27842 hypothetical protein FU1 1210 1172 

412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 11.72 

401680 NM_005578 Hs.180398 LIM domain-containing preferred transloc 11.69

422576 BE548555 Hs.118554 CGI-83 protein 11.68

450203 AF097994 Hs.301528 kynurenine/alpha-aminoadipate aminotra 11.68

410531 AW752953 gb:QV(KJT0224.2610994)35-g02 CT0224 Homo 11.67 

425917 W28517 Hs.117167 Homo sapiens cDNA: FU23067 fis, clone L 11.66 

418693 AI750878 Hs37409 thrombospondin 1 11.64 

400557 11.62 
416188 BE157260 Hs.79070 v-myc avian myelocytomatosis viral oncog 11.60

419047 AW952771 Hs.90043 ESTs 1159 

420441 AI986160 Hs.88446 ESTs 1159 

4008B5 1157 

5 409853 AW502327 gb:UW-BR0p^rta-a-07-0-Ul.r1 NIH_MGC_5 1156 

400802 1156 

434540 NM_016045 Ms5184 TH1 drosophila homolog 1155 

431449 M55994 Hs256278 tumor necrosis factor receptor superfami 1155 

425928 S55736 Hs238852 ESTs, Weakly similar to hypothetical pro 1154 

10 434701 AA460479 Hs.4096 KIAA0742 protein 1153 

434228 Z42047 Hs283978 ESTs; KIAA0738 gene product 1152 

420729 AW964897 H&290825 ESTs 1152 

428328 AA426080 Hs.98489 ESTs 1150 

433887 AW204232 Hs279522 ESTs 1150 

15 414812 X72755 Hs.77367 monokine induced by gamma interferon 11.46 

457718 F18572 Hs22978 ESTs 11.44 

452260 AA453208 Hs28726 RAB9, member RAS oncogene family 11.42 

459029 AA131376 Hs285203 fibroblast growth factor 12 11.42 

456267 AI127958 Hs53393 cystatin E/M 11.39

20 433285 AW975944 Hs237396 ESTs 11-38 

449186 AW291876 Hs.196986 ESTs 11-37 

447861 AI434593 Hs.164294 ESTs 11-37 

456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 1136 

439444 A1277652 Hs54578 ESTs 1131 

25 401163 1131 

430886 L36149 Hs2481 16 chemokine (C motif) XC receptor 1 1 1 28 

450784 AW246803 Hs.47289 ESTs 11-28 

452391 AL044829 Hs29331 carnitine palmitoyltransferase I, muscle 11.27 

449625 NMJH4253 Hs23796 odz (odd Oz/terwn, Drosophila) homolog 1 1126 

30 456827 AA075687 Hs.147176 epidermal growth factor receptor substra 1124 

439328 W07411 Hs.118212 ESTs, Moderately similar to ALU3_HUMAN A 1124 

432093 H28383 gb:yt52cQ3.r1 Soares breast 3NbHBst Homo 1 1 24 

407335 AA631047 Hs.158761 Homo sapiens cONAFU13054fis, done NT 1123 

442501 AA315267 Hs23128 ESTs 1122 

35 429746 AJ237672 Hs214142 5, 10-methylenetetrahydrofolate reductase 1121 

422858 R35398 gb:yg64g10j1 Soares infant brain 1 NIB H 1120 

415156 X84908 Hs.78060 phosphorylase kinase, beta 1120 

446713 AV660122 Hs282675 ESTs 1120 

452221 C21322 Hs.11577 ESTs 1120 

418261 W78902 Hs.293297 ESTs 11.17

433332 AI367347 Hs.127809 ESTs 11.16

434539 AW748078 Hs.214410 ESTs 11.16

413471 BE142098 gb:CM4-HT0137-220999-017-d11 HT0137 Homo 11.14

410037 AB020725 Hs.58009 KIAA0918 protein 11.14

405601 11.13

458332 AI000341 Hs.220491 ESTs 11.12

427654 AA410183 Hs.137475 ESTs 11.12

427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10

431475 AI567669 Hs.287316 ESTs 11.10

425710 AF030880 Hs.159275 solute carrier family, member 4 11.08

413748 AW104057 Hs.19193 ESTs 11.07

409208 Y00093 Hs.51077 integrin, alpha X (antigen CD11C (p150), 11.07

457278 W92745 Hs.193324 ESTs 11.03

407021 U52077 gb:Human mariner1 transposase gene, comp 11.02

445701 AF055581 Hs.13131 lymphocyte adaptor protein 11.02

408338 AW867079 gb^lR1-SN0033-12O4O(H)02-c10SN0O33Homo 10.95

401030 BE382701 Hs.25960 v-myc avian myelocytomatosis viral relat 10.95

437891 AW006969 Hs.6311 hypothetical protein FLJ20859 10.94

453874 AW591783 Hs.36131 collagen, type XIV, alpha 1 (undulin) 10.94

421562 AA530994 Hs.105803 ghrelin precursor 10.92

413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 10.92

400132 10.92

436420 AA443966 Hs.31595 ESTs 10.90

424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 10.88

433264 D85782 Hs.3229 cysteine dioxygenase, type I 10.88

429842 AI366213 Hs.173422 KIAA1605 protein 10.87

412405 AW948126 gb:Ra>-MT0013-280300<)31-a12MT0013Homo 10.85 

400615 10.80

425018 BE245277 Hs.154196 E4F transcription factor 1 10.80

456011 BE243628 gb:TCBAP1D1053 Pediatric pre-B cell acut 10.79

455982 BE176862 gb:RO4fr0587-1703Q0-012-a04 HT0587 Homo 10.74

450418 BE218418 Hs.201802 ESTs 10.73

412490 AW803564 Hs.288850 ESTs 10.72

436962 AW377314 Hs.5364 DKFZP564I052 protein 10.70

437743 AI383497 Hs.131811 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.70

449967 R40978 Hs.271498 ESTs, Moderately similar to ALU^HUMAN A 10.70

449590 AA694070 Hs.268835 ESTs 10.68

446035 NM_006558 Hs.13565 Sam68-like phosphotyrosine protein, T-ST 10.68

426530 U24578 Hs.170250 complement component 4A 10.66

428600 AW863261 Hs.15036 ESTs, Highly similar to AF161358_1 HSPC0 10.64

420090 AA220238 Hs.94986 ribonuclease P (38kD) 10.64

451593 AF151879 Hs.26706 CGI-121 protein 10.62

438893 AF075031 Hs.29327 ESTs 10.62

459324 AW080953 gb:xc28c12.x1 NCI_CGAP_Co18 Homo sapiens 10.61

439883 AL359652 Hs.171096 Homo sapiens EST from clone DKFZp434A041 10.58

406513 AA715328 Hs.291205 ESTs 10.57

407826 AA128423 Hs.40300 calpain 3, (p94) 10.57

419550 D50918 Hs.90998 KIAA0128 protein; septin 2 10.56

428522 R10184 Hs.191987 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.56

459526 AI142350 Hs.146735 EST 10.55

411448 AA178955 Hs.271439 ESTs 10.54

410102 AW248508 Hs.279727 ESTs; 10.52

406577 10.52

408405 AK001332 Hs.44672 hypothetical protein FLJ10470 10.51

428966 AF059214 Hs.194687 cholesterol 25-hydroxylase 10.50

400880 10.48

415875 AA894876 Hs.5687 protein phosphatase 1B (formerly 2C), ma 10.48

434715 BE005346 Hs.116410 ESTs 10.46

406851 AA609784 Hs.180255 major histocompatibility complex, class 10.44

413409 AI638418 Hs.21745 ESTs 10.44

418489 U76421 Hs.85302 adenosine deaminase, RNA-specific, B1 (h 10.44

419465 AW500239 Hs.21187 Homo sapiens cDNA: FLJ23068 fis, clone L 10.44

419544 AI909154 gb^V-BT20(M)10499^)07BT200Homosapien 10.44

432180 Y18418 Hs.272822 RuvB (E coli homolog)-like 1 10.44

413822 R08950 Hs.272044 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.42

437446 AA788946 Hs.16869 ESTs, Moderately similar to CA1C_RAT COL 10.41

415701 NM_003878 Hs.78619 gamma-glutamyl hydrolase (conjugase, fol 10.41

443790 NM_003500 Hs.9795 acyl-Coenzyme A oxidase 2, branched chai 10.40

458873 AW150717 Hs.296176 STAT Induced STAT Inhibitor 3 10.38

415082 AA160000 Hs.137396 ESTs 10.37

429124 AW505086 Hs.196914 minor histocompatibility antigen HA-1 10.36

417187 AB011151 Hs.81505 KIAA0579 protein 10.34

426827 AW067805 Hs.172665 methylenetetrahydrofolate dehydrogenase 10.34

424280 NM_000030 Hs.271366 alanine-glyoxylate aminotransferase homo 10.33

446099 T93096 Hs.17126 ESTs 10.32

423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 10.31

409995 AW960597 Hs.30164 ESTs 10.30

432242 AW022715 Hs.162160 ESTs, Weakly similar to ALU4_HUMAN ALU S 10.30

406394 AA172106 Hs.110950 Rag C protein 10.30

406189 10.29

422283 AW411307 Hs.114311 CDC45 (cell division cycle 45, S.cerevis 10.26

401598 AA172106 Hs.110950 RagC protein 10.26

456995 T89832 Hs.170278 ESTs 10.26

416511 NM_006762 Hs.79356 Lysosomal-associated multispanning membr 10.24

427274 NM_005211 Hs.174142 colony stimulating factor 1 receptor, fo 10.24

401384 10.23

456226 D13168 Hs.82002 endothelin receptor type B 10.22

426928 AF037062 Hs.172914 retinol dehydrogenase 5 (11-cis and 9-cis 10.21

423032 AI684746 Hs.119274 ESTs 10.20

436556 AI364997 Hs.7572 ESTs 10.20

418400 BE243026 Hs.301989 KIAA0246 protein 10.19 

437401 AA757196 Hs.121190 ESTs 10.19 

403690 10.17 

65 423790 BE152393 gb£M2-HT()323-171199<)33^08HT0323Homo 10.16 

434094 AA305599 Hs.238205 hypothetical protein PRO2013 10.16

434967 AW975009 Hs.292274 ESTs 10.16

432827 Z68128 Hs.3109 Rho GTPase activating protein 4 10.16 

432660 AI288430 Hs.64004 ESTs 10.14 

452234 AW084176 Hs.223296 ESTs 10.14 

445629 AI245701 gb:qk31f05.x1 NCI_CGAP_Kid3 Homo sapiens 10.13

457236 AA626142 Hs.179991 ESTs, Weakly similar to KPCE_HUMAN PROTE 10.13

444605 AI174603 Hs.254105 enolase 1, (alpha) 10.12

450313 AI036989 Hs.24809 hypothetical protein FLJ10826 10.12

407482 NM_006056 10.12

449971 AA807346 Hs.288581 Homo sapiens cDNA FLJ14296 fis, clone PL 10.11

441201 AW118822 Hs.128757 ESTs 10.10

435157 AW014605 Hs.179872 ESTs 10.10

417308 H60720 Hs.81892 KIAA0101 gene product 10.09

442582 AI204266 Hs.179303 ESTs 10.05

437252 AI433833 Hs.164159 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.04

448663 BE614599 Hs.106823 H.sapiens gene from PAC 426 16, similar t 10.04

434467 BE552368 Hs.231853 Homo sapiens cDNA FLJ13445 fis, clone PL 10.04

423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02

412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 10.00

414658 X58528 Hs.76781 ATP-binding cassette, sub-family D (ALD) 10.00

421832 NM_016098 Hs.108725 HSPC040 protein 10.00

423554 M90516 Hs.1674 glutamine-fructose-6-phosphate transamin 10.00

452039 AI922988 Hs.172510 ESTs 10.00

434673 AW137442 Hs.136965 ESTs 10.00

427978 AA418280 Hs.180040 Homo sapiens cDNA FLJ22439 fis, clone H 10.00

457803 BE501815 Hs.198011 ESTs 9.99 

426279 AA425310 Hs.155766 ESTs 9.98 

444412 AI147652 Hs.216381 Homo sapiens clone HH409 unknown mRNA 9.98

417049 N72394 Hs.44862 ESTs 9.96

427509 M62505 Hs.2161 complement component 5 receptor 1 (C5a l 9.96

445424 AB028945 Hs.12696 cortactin SH3 domain-binding protein 9.96

443678 AW009605 Hs.231923 ESTs 9.96

447567 AW474513 Hs.224397 ESTs, Weakly similar to B48013 proline-r 9.94

414709 AA704703 Hs.77031 Sp2 transcription factor 9.94

434596 T59538 gb:yb65g12.s1 Stratagene ovary (937217) 9.94

427630 BE276115 Hs.144980 ESTs, Weakly similar to CA13_HUMAN COLLA 9.93

416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 9.92

423349 AF010258 Hs.127428 homeo box A9 9.92

424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. 9.92

416814 AW192307 Hs.80042 dolichyl-P-Glc:Man9GlcNAc2-PP-dolichylgl 9.90

417986 AA481003 Hs.97128 ESTs 9.90

425174 D87450 Hs.154978 KIAA0261 protein 9.90

438171 AW976507 Hs.293515 ESTs 9.90

421984 AW972187 Hs.110443 hypothetical protein FLJ22215 9.89

408597 NM_005291 Hs.46453 G protein-coupled receptor 17 9.88

413907 AI097570 Hs.71222 ESTs 9.87

451296 AW801383 Hs.118578 H.sapiens mRNA for ribosomal protein L18 9.86

433409 AI278802 Hs.25661 ESTs 9.85

450360 AW117416 Hs.245484 ESTs 9.85

433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84

449824 AI962552 Hs.226765 ESTs 9.84

452744 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 9.82

431066 AF026273 Hs.249175 interleukin-1 receptor-associated kinase 9.82

426457 AW894667 Hs.169965 chimerin (chimaerin) 1 9.80

443371 AI792888 Hs.145489 ESTs 9.80

437159 AL050072 gb:Homo sapiens mRNA; cDNA DKFZp566E1346 9.75

425242 D13635 Hs.155287 KIAA0010 gene product 9.74 

447498 N67619 Hs.43687 ESTs 9.74

426759 AI590401 Hs.21213 ESTs 9.73

435129 AI381659 Hs.267086 ESTs 9.72

437672 AW748265 Hs.5741 flavohemoprotein b5+b5R 9.72

438209 AL120659 Hs.6111 KIAA0307 gene product 9.72

438440 AA807228 Hs.225161 ESTs 9.72

449720 AA311152 Hs.288708 ESTs; Weakly similar to KIAA0226 [H.sapi 9.72

414291 AI289619 Hs.13040 ESTs 9.72

436206 AK001451 Hs.265561 CD2-associated protein 9.70

446896 T15767 Hs.22452 Homo sapiens cDNA: FLJ21084 fis, clone C 9.70

412667 AW977540 Hs.269254 ESTs 9.70

423301 S67580 Hs.1645 cytochrome P450, subfamily IVA, polypept 9.67

440757 AW118645 Hs.160004 ESTs 9.67

441412 AI393657 Hs.159750 ESTs 9.66

421044 AF061871 Hs.101302 collagen, type XII, alpha 1 9.66

414726 BE466863 Hs.280099 ESTs 9.66

418485 R91679 Hs.124981 ESTs 9.66 

433480 X02422 Hs.181125 immunoglobulin lambda locus 9.65 

441530 AI248301 Hs.127112 ESTs 9.65 

433533 D53304 Hs.65394 ESTs 9.65

421470 R27496 Hs.1378 annexin A3 9.64

438613 C05569 Hs.243122 hypothetical protein FLJ13057 similar to 9.64

429324 AA488101 Hs.199245 inactivation escape 1 9.62

450244 AA007534 Hs.125062 ESTs 9.62

407660 AW063190 Hs.279101 ESTs 9.61

406554 9.60

426404 AA377607 Hs.273138 ESTs 9.58

447045 AW392394 Hs.278569 KIAA0064 gene product 9.58

449894 AK001578 Hs.24129 hypothetical protein FLJ10716 9.58

448376 AI494332 Hs.196963 ESTs 9.58

407902 AL117474 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 9.56

446572 AV659151 Hs.282961 ESTs 9.56

459245 BE242623 Hs.31939 manic fringe (Drosophila) homolog 9.55

423545 AP000692 Hs.129781 chromosome 21 open reading frames 9.54

414697 BE266134 Hs.76927 translocase of outer mitochondrial membr 9.54

410846 AW807057 gb:MR4-ST0062-Q31 199418-003 ST0062 Homo 9.52

421181 NM_005574 Hs.184585 LIM domain only 2 (rhombotin-like 1) 9.52

427308 D26067 Hs.174905 KIAA0033 protein 9.52

415995 NM_004573 Hs.994 phospholipase C, beta 2 9.51

434846 AW295389 Hs.119768 ESTs 9.51

414342 AA742181 Hs.75912 Homo sapiens cDNA: FLJ22199 fis, clone H 9.50

416959 D28459 Hs.80612 ubiquitin-conjugating enzyme E2A (RAD6 h 9.50

443123 AA094538 Hs.6588 ESTs 9.50 

439312 AA833902 Hs.270745 ESTs 9.48 

449375 R07114 Hs.271224 ESTs 9.48

436357 AJ132085 gb:Homo sapiens mRNA for axonemal dynein 9.44

458723 AW137726 Hs.244352 ESTs, Moderately similar to laminin alph 9.44

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H.sapiens] 9.43

404741 9.43

422409 NM_005428 Hs.116237 vav 1 oncogene 9.43

403708 9.42

408806 AW847814 Hs.289005 Homo sapiens cDNA: FLJ21532 fis, clone C 9.42

417380 T06809 gb:EST04698 Fetal brain, Stratagene (cat 9.42

422501 AA354690 Hs.144967 ESTs 9.42

426197 AA004410 Hs.167835 acyl-Coenzyme A oxidase 1, palmitoyl 9.42

452624 AU076606 Hs.30054 coagulation factor V (proaccelerin, labi 9.42

412110 AW893569 p^^CT>NN0021-040400<)21-c10NN0021 Homo 9.41 

414158 AA361623 Hs.288775 Homo sapiens cDNA FLJ13900 fis, clone TH 9.41

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 9.40

414171 AA360328 Hs.865 RAP1A, member of RAS oncogene family 9.40

415947 U04045 Hs.78934 mutS (E coli) homolog 2 (colon cancer, 9.40

426959 BE262745 gb:601153869F1 NIH_MGC_19 Homo sapiens c 9.39

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 9.39

457181 BE514362 Hs.296422 FK506-binding protein 3 (25kD) 9.39

402835 9.38

404632 9.38

446566 H95741 Hs.17914 Homo sapiens cDNA: FLJ22801 fis, clone K 9.37

455369 AW903533 gb:CM1-NN1031-060400-178-d05 NN1031 Homo 9.37

444001 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 9.36

458191 AI420611 Hs.127832 ESTs 9.36

431374 BE258532 Hs.251871 CTP synthase 9.34

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 9.33

407061 X97748 gb:H.sapiens PTX3 gene promotor region. 9.33

416967 BE616731 Hs.80645 interferon regulatory factor 1 9.33

423013 AW875443 Hs.22209 secreted modular calcium-binding protein 9.33

439461 AA693960 Hs.103158 ESTs 9.33

418830 BE513731 Hs.58959 Human DNA sequence from clone 967N21 on 9.32

422763 AA033699 Hs.83938 ESTs, Moderately similar to MASP-2 [H.sa 9.32

442739 NM_007274 Hs.5679 cytosolic acyl coenzyme A thioester hydr 9.32

452859 AI300555 Hs.288158 Homo sapiens cDNA: FLJ23591 fis, clone L 9.32

403237 9.32

415000 AW025529 Hs.239812 ESTs, Weakly similar to CALM_HUMAN CALMO 9.31

417951 AW976410 Hs.289069 Homo sapiens cDNA: FLJ21016 fis, clone C 9.30

419066 Z98492 Hs.6975 PRO1073 protein 9.30

448443 AW167128 Hs.231934 ESTs 9.30

405125 9.30 

409768 AW499566 ob^JI-HF-BROHj-h-03-(HJKr1 NIH.MGCJ 9.28

453708 AI191811 Hs.54629 ESTs 9.28

442271 AF000652 Hs.8180 syndecan binding protein (syntenin) 9.27

410055 AJ250839 Hs.58241 gene for serine/threonine protein kinase 9.26

448692 AW013907 Hs.224276 ESTs, Moderately similar to predicted us 9.26

417381 AF164142 Hs.82042 solute carrier family 23 (nucleobase tra 9.25

422497 D29642 Hs.1528 KIAA0053 gene product 9.25

414140 AA281279 Hs.23317 ESTs 9.24

435980 AF274571 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 9.24

458530 BE395035 Hs.199889 ESTs, Weakly similar to KIAA0874 protein 9.24

402585 9.24

420819 AA280700 gb:zs95h11.s1 NCI_CGAP_GCB1 Homo sapiens 9.23

444755 AA431791 Hs.183001 ESTs 9.22

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 9.22

421246 AW582962 Hs.300961 ESTs, Highly similar to AF151805_1 CGI-4 9.20

421924 BE514514 Hs.109606 coronin, actin-binding protein, 1A 9.19

414888 AL039185 Hs.77558 thyroid hormone receptor interactor 7 9.18

434267 AI206589 Hs.116243 ESTs 9.17

409213 U61412 Hs.51133 PTK6 protein tyrosine kinase 6 9.17

428242 K55709 Hs.2250 leukemia inhibitory factor (cholinergic 9.16

451736 AW080356 Hs.293684 ESTs, Weakly similar to alternatively sp 9.15

413627 BE182082 Hs.246973 ESTs 9.14

416134 AA528402 Hs.74861 activated RNA polymerase II transcriptio 9.14

449251 AW151660 Hs.31444 ESTs 9.14

452813 U54727 Hs.191445 ESTs 9.14

443622 AI911527 Hs.11805 ESTs 9.14

413260 BE075281 gb:PM1-BT0585-29020XM)05-d07 BT0585 Homo 9.12

413450 Z99716 Hs.75372 N-acetylgalactosaminidase, alpha- 9.12

446442 BE221533 Hs.257858 ESTs 9.12

438540 AA810021 Hs.136906 ESTs 9.12

426251 M24283 Hs.168383 Intercellular adhesion molecule 1 (CD54) 9.11

410290 AA402307 Hs.73818 ubiquinol-cytochrome c reductase hinge p 9.10

437398 AA913736 Hs.126715 ESTs 9.10

421559 NM_014720 Hs.105751 Ste20-related serine/threonine kinase 9.10

439699 AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10

430799 C19035 Hs.164259 ESTs 9.09

424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-amino aci 9.08

453942 AW190920 Hs.19928 ESTs 9.08

425844 T68073 Hs.159628 serine (or cysteine) proteinase inhibito 9.08

434658 AI624436 Hs.194488 ESTs 9.07

453999 BE328153 Hs.240087 ESTs 9.06

436490 R71543 Hs.18713 ESTs 9.05

409192 AA065131 Hs.233439 ESTs, Weakly similar to ALU7_HUMAN ALU S 9.05

446223 BE300091 Hs.119699 hypothetical protein FLJ12969 9.04

447247 AW369351 Hs.287955 Homo sapiens cDNA FLJ13090 fis, clone NT 9.04

450094 AI174947 Hs.295789 Homo sapiens mRNA; cDNA DKFZp564D1164 (f 9.04

432012 AW301344 Hs.195969 ESTs 9.04 

422520 AU076730 Hs.117977 kinesin 2 (60-70kD) 9.02

418650 BE386750 Hs.86978 prolyl endopeptidase 9.02

423008 M81590 Hs.123016 5-hydroxytryptamine (serotonin) receptor 9.02

436476 AA326108 Hs.53631 ESTs 9.02

448206 BE622585 Hs.3731 ESTs 9.02

431574 AW572659 Hs.261373 adenosine A2b receptor pseudogene 9.01

443453 R99876 Hs.269882 ESTs 9.01

435472 AW972330 Hs.283022 triggering receptor expressed on myeloid 9.01

420337 AW295840 Hs.14555 Homo sapiens cDNA: FLJ21513 fis, clone C 9.00

449810 AB006681 Hs.23994 activin A receptor, type IIB 9.00

406780 AA902386 Hs.286 ribosomal protein L4 8.99

429169 AW341130 Hs.197757 ESTs, Moderately similar to FGFE_HUMAN F 8.99

421326 AF051428 Hs.103504 estrogen receptor 2 (ER beta) 8.97

425491 AA883316 Hs.255221 ESTs 8.96

425516 BE000707 Hs.29567 ESTs 8.96

439773 AI051313 Hs.143315 ESTs 8.96

443247 BE614387 Hs.47378 ESTs 8.96

456623 AI084125 Hs.108106 transcription factor 8.95

438707 L08239 Hs.5326 porcupine 8.95

402240 8.95

444152 AI125694 Hs.149305 Homo sapiens cDNA FLJ14264 fis, clone PL 8.95

409842 AW501756 gbAJI-HF-BROp-a]nH>09^Ul.r1 NIHJAGC_5 8.94 

416277 W78765 Hs.73580 ESTs 8.94 

456697 AI908006 Hs.111334 ferritin, light polypeptide 8.94

410762 AF226053 Hs.66170 HSKM-B protein 8.92

412942 AL120344 Hs.75074 mitogen-activated protein kinase-activat 8.92

442320 AI287817 Hs.129636 ESTs 8.92

449673 AA002064 Hs.18920 ESTs 8.91

411486 N85785 Hs.181165 eukaryotic translation elongation factor 8.90

437916 BE566249 Hs.20999 Homo sapiens cDNA: FLJ23142 fis, clone L 8.90

442732 AA257161 Hs.8658 hypothetical protein DKFZp434E0321 8.89

419741 NM_007019 Hs.93002 ubiquitin carrier protein E2-C 8.89

411499 AW849292 gb:IL3-CT0215-O203(XH)9O-E06 CT0215 Homo 8.89

431154 AW971228 Hs.290259 ESTs 8.89

414922 D00723 Hs.77631 glycine cleavage system protein H (amino 8.88

418036 Z37976 Hs.83337 latent transforming growth factor beta b 8.87

406422 8.87

422926 NM_016102 Hs.121748 ring finger protein 16 8.87

435220 D50030 Hs.104 HGF activator 8.86

418203 X54942 Hs.83758 CDC28 protein kinase 2 8.86

418613 AA744529 Hs.86575 mitogen-activated protein kinase kinase 8.85

439250 H66566 Hs.271711 ESTs 8.85

432359 AA076049 Hs^74415 Homo sapiens cDNA FLJ10229 fis, clone HE 8.84

450000 AI952797 Hs.10888 Homo sapiens cDNA: FLJ21559 fis, clone C 8.83

425657 T89839 Hs.119471 ESTs 8.83

425694 U51333 Hs.159237 hexokinase 3 (white cell) 8.82

419972 AL041465 Hs.294038 ESTs, Moderately similar to ALIKLHUMAN A 8.82

436396 AI683487 Hs.299112 Homo sapiens cDNA FLJ11441 fis, clone HE 8.82

413413 D82520 Hs.301834 Homo sapiens cDNA FLJ10952 fis, clone PL 8.82

428807 AA435997 Hs.104930 ESTs 8.82

415839 R40611 Hs.137565 ESTs 8.81

419553 N34145 Hs.250614 ESTs 8.80

420309 AW043637 Hs.21766 ESTs 8.80

421863 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 8.80

447965 AW292577 Hs.94445 ESTs 8.80

459172 BE063380 gb:PM0-BT0275-291099^KE-g10BT0275Homo 8.80

403259 8.78

411534 AW850473 gb:IL3-CT0219-2801(XH)61-B11 CT0219 Homo 8.78

456161 BE264645 Hs.282093 Homo sapiens cDNA: FLJ21918 fis, clone H 8.77

413654 AA331881 Hs.75454 peroxiredoxin 3 8.76

401744 8.76

425348 AL137477 Hs.155912 cadherin-like 24 8.76

423396 AI382555 Hs.127950 bromodomain-containing 1 8.75

450649 NM_001429 Hs.297722 Human DNA sequence from clone RP1-85F18 8.75

408331 NM_007240 Hs.44229 dual specificity phosphatase 12 8.74

423872 AB020316 Hs.134015 uronyl 2-sulfotransferase 8.74

424906 AI566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3' 8.74

427596 AA449506 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (f 8.73

432488 AA551010 Hs.216640 ESTs 8.72

448980 AL137527 Hs.22703 Homo sapiens mRNA; cDNA DKFZp434P1018 (f 8.72

429455 AI472111 Hs.292507 ESTs 8.71

429855 AW385597 Hs.138902 ESTs, Weakly similar to B34087 hypotheti 8.71

441746 H59955 Hs.127829 ESTs 8.70

411945 AL033527 Hs.92137 v-myc avian myelocytomatosis viral oncog 8.70

413492 D87470 Hs.75400 KIAA0280 protein 8.70

435706 W31254 Hs.7045 GL004 protein 8.70

433741 AA609019 Hs.159343 ESTs 8.70

426340 Z97989 Hs.169370 FYN oncogene related to SRC, FGR, YES 8.69

422779 AA317036 Hs.41989 ESTs 8.67

449785 AI225235 Hs.288300 Homo sapiens cDNA: FLJ23231 fis, clone C 8.67

420144 AA811813 Hs.119421 ESTs 8.66

420235 AA256756 Hs.31178 ESTs 8.66

432606 NM_002104 Hs.3066 granzyme K (serine protease, granzyme 3; 8.66

425762 BE244076 Hs.159578 Homo sapiens mRNA for FU00020 protein, 8.65

427448 BE246449 Hs.2157 Wiskott-Aldrich syndrome (eczema-thrombo 8.64

418033 W68160 Hs.259855 Homo sapiens cDNA FLJ12507 fis, clone NT 8.64

429084 AJ001443 Hs.195614 splicing factor 3b, subunit 3, 130kD 8.64

417094 NM_006895 Hs.81182 histamine N-methyltransferase 8.64

457277 NM_004736 Hs.227656 xenotropic and polytropic retrovirus rec 8.63

422631 BE218919 Hs.118793 hypothetical protein FLJ10688 8.63

410679 AW795196 Hs.215857 ring finger protein 14 8.63

431585 BE242803 Hs.262823 hypothetical protein FLJ10326 8.62

401851 8.62

401866 8.62

407783 AW996872 Hs.172028 a disintegrin and metalloproteinase doma 8.62

408242 AA251594 Hs.43913 PIBF1 gene product 8.62

422250 AW408530 Hs.113823 ClpX (caseinolytic protease X, E.coli) 8.62

430259 BE550182 Hs.127826 RalGEF-like protein 3, mouse homolog 8.62

452598 AI831594 Hs.68647 ESTs, Weakly similar to ALU7_HUMAN ALU S 8.62

419541 AW749617 gbiRtt-BT0502-13010M12-g07 BT05Q2 Homo 8.60

428839 AI767756 Hs.82302 ESTs 8.60

429328 AA829402 Hs.47939 ESTs 8.60

451491 AI972094 Hs.286221 Homo sapiens cDNA FLJ13741 fis, clone PL 8.60

452561 AI692181 Hs.49169 KIAA1634 protein 8.60

420027 AF009746 Hs.94395 ATP-binding cassette, sub-family D (ALD) 8.60

435205 X54136 Hs.181125 immunoglobulin lambda locus 8.60

430900 U91939 Hs.248123 G protein-coupled receptor 25 8.60

405074 8.59

437991 AI479773 Hs.181679 ESTs 8.59

436346 BE328882 Hs.193096 ESTs, Moderately similar to U119_HUMAN U 8.58

411079 AA091228 gb:cchn2152.seq.F Human fetal heart, lam 8.57

418452 BE379749 Hs.85201 C-type (calcium dependent, carbohydrate- 8.56

429109 AL008637 Hs.196352 neutrophil cytosolic factor 4 (40kD) 8.56

448019 AW947164 Hs.195641 ESTs 8.56

449865 AW204272 Hs.199371 ESTs 8.55

431180 H55883 gb:yq94h03.r1 Soares fetal liver spleen 8.54

445988 BE007663 Hs.13503 inactivation escape 2 8.54

405876 8.54

407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 8.54

414807 AI738616 Hs.77348 hydroxyprostaglandin dehydrogenase 15-(N 8.54

425671 AF193612 Hs.159142 lunatic fringe (Drosophila) homolog 8.54

452413 AW082633 Hs.212715 ESTs 8.54

421620 AA446183 Hs.91885 ESTs 8.53

444539 AI955765 Hs.146907 ESTs 8.52

415102 M31899 Hs.77929 excision repair cross-complementing rode 8.51

405552 8.51

418068 AW971155 Hs.293902 ESTs, Weakly similar to prolyl 4-hydroxy 8.50

420133 AA426117 Hs.14373 ESTs 8.50

438887 R68857 Hs565499 ESTs 8.50

446468 AI765890 Hs.16341 ESTs; Moderately similar to !!!! ALU SUB 8.50

446585 AV659397 Hs.282948 ESTs 8.50

441896 AW891873 gb:CM3-NT0090-040500-173-b02 NT0090 Homo 8.50

437718 AI927288 Hs.196779 ESTs 8.48

420656 AA279098 Hs.187636 ESTs 8.48

429303 AW137635 Hs.44238 ESTs 8.48

450624 AL043983 Hs.125063 Homo sapiens cDNA FLJ13825 fis, clone TH 8.48

452573 AI907957 Hs.287622 Homo sapiens cDNA FLJ14082 fis, clone HE 8.48

456341 AA229126 Hs.122647 N-myristoyltransferase 2 8.48

423024 AA593731 Hs.75613 CD36 antigen (collagen type I receptor, 8.47

446985 AL038704 Hs.156827 ESTs, Weakly similar to ALU1_HUMAN ALU S 8.46

431778 AL080276 Hs.268562 regulator of G-protein signalling 17 8.46

400268 8.46

421828 AW891965 Hs^89109 dimethylarginine dimethylaminohydrolase 8.45

417022 NM_014737 Hs.80905 Ras association (RalGDS/AF-6) domain fam 8.44

421029 AW057782 Hs.293053 ESTs 8.44

425171 AW732240 Hs500615 ESTs 8.44

459070 AI814302 gbw)71c12j(1 NCI_CGAP_Lu19 Homo sapiens 8.42

406006 8.42

412643 AW971239 Hs.293982 ESTs 8.42

424775 AB014540 Hs.153026 SWAP-70 protein 8.42

446848 AW136083 Hs.195266 ESTs, Weakly similar to S59501 interfero 8.42

448043 AI458653 Hs.201881 ESTs 8.41

407183 AA358015 gb:EST66864 Fetal lung III Homo sapiens 8.40

412324 AW978439 Hs.69504 ESTs 8.40

419594 AA013051 Hs.91417 topoisomerase (DNA) II binding protein 8.40

430968 AW972630 gb:EST384925 MAGE resequences, MAGL Homo 8.40

431689 AA305688 Hs.267695 UDP-Gal:betaGlcNAc beta 1,3-galactosyltr 8.40

438582 AI521310 Hs.263365 ESTs, Weakly similar to ALU5_HUMAN ALU S 8.40

447685 AL122043 Hs.19221 hypothetical protein DKFZp566G1424 8.40

459119 AW844498 Hs.289052 Homo sapiens LENG8 mRNA, variant C, part 8.38

400817 8.37

425265 BE245297 gb:TCBAP1E2482 Pediatric pre-B cell acut 8.37

409385 AA071267 gb:zm61g01.r1 Stratagene fibroblast (937 8.36

439121 BE047779 Hs.44701 ESTs 8.36

419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 8.36

408327 AW182309 Hs.249963 ESTs, Highly similar to dJ1170K4.4 [H.sa 8.35

403976 8.34

448064 AA379036 gb:EST91809 Synovial sarcoma Homo sapien 8.33

442914 AW188551 Hs.39519 Homo sapiens cDNA FLJ14007 fis, clone Y7 8.33

428032 AW997704 Hs.11493 Homo sapiens cDNA FLJ13536 fis, clone PL 8.32

434194 AF119847 Hs.283940 Homo sapiens PRO1550 mRNA, partial cds 8.32

458677 AW937670 Hs.254379 ESTs 8.32

420925 NM_015698 Hs.100391 T54 protein 8.30

416475 T70298 gb:yd26g02.s1 Soares fetal liver spleen 8.30

416852 AF283776 Hs.80285 Homo sapiens mRNA; cDNA DKFZp586C1723 (f 8.30

430676 AF084866 gb:Homo sapiens envelope protein RIC-3 ( 8.30

428455 AI732694 Hs.98520 ESTs 8.29

435343 AW194962 Hs.199028 ESTs 8.29

450783 BE266695 gb:601190242F1 NIH_MGC_7 Homo sapiens cD 8.29

404946 8.28

422942 AF054839 Hs.122540 tetraspan 2 8.28

453716 AA037675 Hs.152675 ESTs 8.28

437098 AA744488 Hs.132842 ESTs, Moderately similar to ALU^HUMAN A 8.28

443907 AU076484 Hs.9963 TYRO protein tyrosine kinase binding pro 8.27

401930 AF106069 Hs.23168 ubiquitin specific protease 15 8.26

446554 AA151730 Hs.301789 ESTs, Weakly similar to similar to C.ele 8.26

426290 AB007918 Hs.169182 KIAA0449 protein 8.25

419904 AA974411 Hs.18672 ESTs 8.25

413886 AW958264 Hs.103832 ESTs, Weakly similar to TRHY_HUMAN TRICH 8.24

424738 AI963740 Hs.46826 ESTs 8.24

427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 8.24

424534 D87682 Hs.150275 KIAA0241 protein 8.24

424429 U63830 Hs.146847 TRAF family member-associated NFKB activ 8.24

442604 BE263710 Hs.279904 ESTs 8.22

442992 AI914699 Hs.13297 ESTs 8.22

427210 BE396283 Hs.173987 eukaryotic translation initiation factor 8.22

457229 BE222450 Hs.266390 ESTs 8.21

423730 AA330214 gb:EST33935 Embryo, 12 week II Homo sapi 8.21

411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 8.20

416051 AA835868 Hs.25253 Homo sapiens cDNA: FLJ20935 fis, clone A 8.20

417231 R40739 Hs.21326 ESTs 8.20

422049 W25760 Hs.77631 glycine cleavage system protein H (amino 8.20

427528 AU077143 Hs.179565 minichromosome maintenance deficient (S. 8.20

458776 AV654978 Hs.19904 cystathionase (cystathionine gamma-lyase 8.19 

417687 AI828596 Hs.250691 ESTs 8.18

423218 NM_015896 Hs.167380 BLu protein 8.18

425397 J04088 Hs.156346 topoisomerase (DNA) II alpha (170kD) 8.18

406964 M21305 Hs.247946 Human alpha satellite and satellite 3 ju 8.18

402401 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.18

423397 NM_001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18

427857 AL133017 Hs.2210 thyroid hormone receptor interactor 3 8.17

401519 8.17

447188 H65423 Hs.17631 Homo sapiens cDNA FLJ20118 fis, clone CO 8.16

424704 AI263293 Hs.152096 cytochrome P450, subfamily IIJ (arachido 8.16

435854 AJ278120 Hs.4996 DKFZP564D166 protein 8.14

448556 AW885606 Hs.5064 ESTs 8.14

449217 AA278536 Hs.23262 ribonuclease, RNase A family, k6 8.14

453124 AI139058 Hs.23296 ESTs 8.14

442812 AI018406 Hs.131284 ESTs 8.14 

421129 BE439899 Hs.89271 ESTs 8.14

TABLE 9A shows the accession numbers for those primekeys lacking a UniGene ID in Table
9. For each probeset, we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 
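
The three-column layout defined above (a Pkey, a CAT number, then the Genbank accessions that make up the cluster) can be read mechanically. The following is a minimal sketch in Python, not part of the described methods. It assumes whitespace-separated fields, a six-digit Pkey at the start of each row, and that any line not starting with a Pkey continues the previous row's accession list. The function name `parse_table_9a` is hypothetical, and the sample rows are abbreviated from the table that follows.

```python
import re

# Assumption: a Pkey is a six-digit Eos probeset identifier, as in the rows shown.
PKEY_RE = re.compile(r"^\d{6}$")


def parse_table_9a(lines):
    """Map each Pkey to its CAT number and list of Genbank accessions.

    A line whose first field is not a six-digit Pkey is treated as a
    continuation of the previous row's accession list.
    """
    table = {}
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip the blank lines between rows
        if PKEY_RE.match(fields[0]) and len(fields) >= 2:
            current = fields[0]
            table[current] = {"cat": fields[1], "accessions": fields[2:]}
        elif current is not None:
            # continuation line: accessions only
            table[current]["accessions"].extend(fields)
    return table


# Sample rows abbreviated from Table 9A (illustrative only).
rows = [
    "408057 1035720_-1 AW139565",
    "",
    "411534 1248827.1 AW850473 AW850471 AW850431 AW850523",
    "",
    "410846 1223902.1 AW807057 AW807054",
    "AW807331",
]
parsed = parse_table_9a(rows)
```

Keying the result on Pkey makes it straightforward to look up the cluster for any Table 9 probeset that is listed with a "gb:" entry instead of a UniGene ID.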



Pkey CAT number Accession 



408057 1035720_-1 AW139565 

408069 103655_1 H81795 Z42291 R20973 AA046920

408182 104479_1 AA047854 AA057506 AA053841

408338 1052148_1 AW867079 AW867086 AW182772

108463_1 BE540279 AW410659 AA057857 R77693 BE278674

409126 110159_1 AA063426 AW962323 AW408063 AA063503 AA772927 AW753492 BE175371 AA311147

409292 111686_1 AA071051 AA070584 AA069938 AA102136 AA074430

409314 111841_1 AA070266 AA084967 AA126998

409385 112523_1 AA071267 T65940 T64515 AA071334

409398 1126716_1 AW386461 AW876408 AW386672 AW386599 AW876258 AW386619 AW386289 AW876136 AW876203 AW876213 AW876301

AW876295 AW876349 AW876365 AW876160 AW876369 AW876352 AW876271

409671 114731_1 AA076769 AA076781 AI087968

409768 1154035_1 AW499566 AW502378 AW499522 AW502046 AW502671 AW501917 AW501868 AW501721 AW502813

409841 1156088_1 AW502139 AW502432 AW502235 AW501683 AW502647

409842 1156119_1 AW501756 AW502096 AW502465 AW501715

409853 1156226_1 AW502327 AW502488 AW501829 AW502625 AW502687

410531 1207200_1 AW752953 H88044 BE156092

410688 1216101_1 AW796342 AW796356 BE161430

410846 1223902.1 AW807057 AW807054 AW807189 AW807193 AW807369 AW807429 AW807364 AW807365 AW807078 AW807256 AW807180
AW807331

410896 1226053_1 AW809637 AW609697 AW810554 AW809707 AW809885 AW810000 AW810088 AW809742 AW809816 AW809749 AW809639
AW809722 AW809836 AW809774 AW810023 AW810013 AW809813 AW809660 AW809728 AW809768 AW809951 AW809657
AW809954

411079 123128_1 AA091228 H71860 H71073

411424 1245497_1 AW845985 AW845991 AW845962

411499 1248105_1 AW849292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427

411507 1248607_1 AW850140 AW850195 AW850192

411534 1248827.1 AW850473 AW850471 AW850431 AW850523

411972 1268491_1 BE074959 AW880160

412110 1277844_1 AW893569 AW893571 AW893588 AW893593

412226 1284289_1 W26766 AW998612 AW902272

412257 1285376_1 AW903830 BE071916

412405 1293012_1 AW948126 AW948139 AW948196 AW948145 AW948162 AW948134 AW948127 AW948124 AW948153 AW948157 AW948125

AW948131 AW948158 AW948164 AW948151

413260 1356003_1 BE075281 BE075219 BE075123 BE075119 BE075046

413471 1371778_1 BE142098 BE142092

413729 1385114_1 BE159999 BE160056 BE160107 BE160139

414182 142409_1 AA136301 AI381776 AA136321

414989 1511339_1 T81668 C19040 C17569

415354 1534763_1 F06495 R24336 R13046

416011 1566439_1 H14487 R50911 Z43216

416475 1596398_1 T70298 H58072 R02750

417380 1672461_1 T06809 N75735

419392 1843934_-1 W28573

419541 185724_1 AW749617 R64714 AA244138 AA244137 BE094019

419544 185760.2 AI909154 AA526337 AA244193 AI909153

420819 196721_1 AA280700 AW975494 AA687385

421245 200620_1 AA285363 AA285333 AA285359 AA285326 AA285350

422673 219674J N59027 AA314694N53937 R08100 
422695
422858 
422940 
423730 
423790 
424385 
424606 
425265 



430676 



431180 
432093 
434596 
436357 
437159 
437495 
439097 
439120 
440134 
441896 
445629 
447229 
448064 
450783 
451045 
452549 
452560 



219996.1 
222209J 
223106.1 
231462.1 
232031J 
238731J 
241409.1 
249175J 
273830.-1 
32168.1 



326269.1 

328906.1 

341283.1 

38937 1 

41842.1 

43393.1 

43765.1 

46858.1 

46879.1 

48675.1 

52842.1 

645767J 

71288J 

74761.1 

84655.1 

85673.1 

921802.1 

922216.1 



452712 928309.1 

453758 980026.1 

454093 1007366.1 

454563 1224342.1 

454791 1234759.1 

454977 1247099.1 

455131 1254674.1 

455183 1259023.1 

455254 1266449.1 

455369 1285173.16 

455982 1396849.1 

456011 1410860.1 



456023 
457586 
457595 
457751 
459070 
459081 
459145 
459172 
459234 



1416335.1 

360505.1 

364225.-1 

399422.1 

883688.1 

889426J 

918957.1 

921 149 J 

945240 -1 



AA315158 AW961298 N76067 AW802759 AI858495 W04474 

R35398 BE252178 AA318153 

BE077458 AA337277 AA319285 

AA330214 AW962519 T547Q9 

BE152393 AA330984 BE073904 

AA339666 AW952809 AA349119 

AA343936 AA344060 AW963081 

BE245297 AA353976 AW505023 

BE262745 

AF084866 AF084870 AF084864 AF084867 AF084869 AF084865 AF084868 AW818206 AW812038 BE144813 BE144812 

AW812041 AW812040 AW812067 BE061583 BE061604 T05808 AI352469 AA580921 BE141783 BE141782 BE061601 

AW814393AW885029 

AW972830 AA527647 AA489820 AA570362 

H55883 AW971249 AA493900 H55788 

H28383 AW972670 H28359 AA525808 

T59538 T59589 T59598 T59542 AF147374 

AJ132085Z83805 

AL050072AW900148 

BE177778 BE177779 AL390180 AA359908 

H66948AF0S5954 H66949 

H56389AF085977 H56173 

BE410734 BE5601 17 BE270054 BE296330 BE267957 AI003007 BE545259 

AW891873 AW891897 BE564764 

AI245701 BE272724 

BE617135 AW504051 AW504283 

AA379036 AA150589 AI696854 BE621316 

BE266695 BE265474 N53200 BE267333 

AA215672 AI696628 AA013335 H86334 AA017006 

AI907039AI907081 

BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212 AW806207 AW806208 
AW806210AI907497 

AW838616 AW838660 BE144343 AI914520 AW88891 0 BE1 84854 BE184784 
U83527 AL120938 U83522 

AW860158 AW862385 AW860159 AW862386 AW862341 AW821869 AW821893 AW062660 AW062656 
AW807530 AW807540 AW807537 AW846086 BE141634 AW846089 AW807499 AW807533 AW838499 
BE071874 BE071882 AW820782 AW821007 

AW848032 AW848630 AW848478 AW848623 AW848484 AW848169 AW848830 AW848149 AW8481 19 AW848893 AW848903 
AW848407 

AW857913 AW857916 AW857914 AW861627 AW861626 AW861624 
AW984111 AW863918 AW863856 

AW877015 AW877133 AW876978 AW877071 AW876988 AW877069 AW877063 AW877013 

AW903533 AW903516 AW903562 BE085202 BE085215 BE085214 BE085209 BE085172 BE085175 BE085193 BE085211 
BE085199 

BE176862 BE176876 BE176947 BE176878 

BE243628 BE246081 BE247016 BE241984 BE241534 BE246091 BE245679 BE243620 BE245998 BE242329 BE241417 

BE241457 BE242522 BE241989 BE241464 

R00028BE247630 

AW062439 AW751554 AA579463 

AA584854 

AI908236AA663731 

AI8143Q2A1814428 

W07808AI822066 

AI903354 AI903489 AI903488 

BE063380 BE063346 AI906097 

AI940425 
TABLE 9B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey    Ref      Strand  Nt_position
400452  8113550  Minus   90308-90505
400557  9801261  Plus    208453-208528,209633-209813
400615  9908994  Plus    118036-118166,118681-118807
400802  8567867  Minus   174571-174856
400817  8569994  Plus    170793-170948
400880  9931121  Plus    29235-29336,36363-36580
400885  9958187  Minus   58242-58733
400926  7651921  Minus   52033-52158,53956-54120 l 54957-55rj52,55420^ o
400952  7658481  Plus    192667-192826,194387-194876
400991  8096825  Plus    159197-159320
401044  8117619  Plus    73501-73674
401124  8570296  Minus   124181-124391
401163  6981820  Plus    5302-5545
401201  9743387  Minus   138534-138629,139234-139294,140121-140335,142033-142479
401286  9801342  Minus   147036-147318
401384  6850939  Minus   58360-58545
401468  6433826  Plus    13056-13482
401515  7630851  Plus    29929-30126
401519  6649315  Plus    157315-157950
401672  9838136  Plus    128526-128704,130755-130860
401744  2576349  Plus    14595-14751
401851  7770425  Minus   146443-146664,147794-147971,148^1-14040^14^
401866  8018106  Plus    73i2D-7oo2o
402240  7690131  Plus    lU43oc-iO-ra27,lUol3oMUoof£
402359  9211204  Minus   40403-41961
402585  9908890  Minus   1 74o93- 1 7o05U,1o32 1 U-l oo-UD
402788  9796102  Plus    QQ070 -irHytOn
                 Minus   53242-53432
402812  6010110  Plus    25026-25091,25844-25920
402828  8918414  Plus    69071-69642
402835  9187337  Plus    26961-27101
402838  9369121  Minus   32589-32735,35478-35666
402842  9369121  Minus   76355-76479
402895  9967547  Plus    85537-85671,86379-86469
402964  9581599  Minus   46624-46784
403137  9211494  Minus   92349-92572,92958-93084,93579-93712
403237  7637807  Plus    7271-7527
403259  7770585  Plus    4693^857
403683  7331517  Plus    217175-217446
403690  7387384  Minus   78627-79583
403708  5705981  Minus   134394-134812
403838  4176355  Plus    19197-19502
403851  7708872  Plus    22733-23007
403976  7657840  Plus    24755-24969
404407  7329316  Minus   48154-48499
404426  7407959  Plus    77842-77954
404632  9796668  Plus    45096-45229
404741  8574139  Plus    143025-143467
404756  7706327  Plus    82849-83627
404946  7382189  Plus    134445-134750
405074  7770440  Plus    44340-44559,44790-45059
405125  8247873  Plus    137113-137814
405172  9966752  Plus    153027-153262
405236  7249076  Minus   151699-151915
405325  6094661  Minus   25818-26380
405411  3451356  Minus   17503-17778,18021-18290
405495  8050952  Minus   72182-72373
405552  1552506  Plus    45199-45647
405601  5815493  Minus   147835-147935,149220-149299
405685  4508129  Minus   37956-38097
405777  7263187  Minus   104773-105051
405856  7653009  Plus    101777-102043
405876  6758747  Plus    39694-40031
405932  7767812  Minus   123525-123713
405934  6758795  Plus    159913-160605
406006  8247801  Minus   42640-42776
406134  9163473  Plus    153291-153452
406189  7289992  Minus   22007-22234
406422  9256411  Plus    163003-163311
406516  7711422  Minus   128375-128449,128560-128784
406538  7711478  Plus    35196-35367,38229-38476,40080-40216,43522-43840
406554  7711566  Plus    106956-107121
406577  7711730  Plus    11377-11509
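
By way of illustration only, an Nt_position entry of Table 9B can be read as a comma-separated list of start-end pairs, one pair per predicted exon. The following Python sketch is not part of the original disclosure: the function names `parse_nt_position` and `exon_lengths` are hypothetical, and treating both coordinates as inclusive is an assumption, since the table does not state its coordinate convention.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split a value such as '208453-208528,209633-209813' into (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start lies after end in {part!r}")
        exons.append((start, end))
    return exons


def exon_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Length of each exon, assuming inclusive coordinates (an assumption)."""
    return [end - start + 1 for start, end in exons]


# Entry for Pkey 400557 (Ref 9801261, Plus strand) as listed in Table 9B.
exons = parse_nt_position("208453-208528,209633-209813")
print(exons)
print(exon_lengths(exons))
```

Entries that are illegible in the table would raise an error in this sketch rather than being silently guessed.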
TABLE 10 shows genes, including expressed sequence tags, that are differentially expressed in
taxol-resistant prostate tumor xenografts as compared to taxol-sensitive prostate tumor
xenografts. The genes are indicated as being either upregulated or downregulated during the
induction of taxol resistance in sequential passages of the grafts.



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
Eos: Internal Eos name
F00-F14: passage number



Pkey    ExAccn         UnigeneID  UnigeneTitle  Eos   Resp  F00   F00   F02   F02   F05   F05   F07   F09   F10   F11   F13   F14
117921  N51002         Hs.47170   UprinA2       PM28  UP    1     9     8     9     32    20    34    122   105   82    71    111
112971  T17185         Hs.4299    ESTs          CHA1  down  290   281   267   335   270   284   150   157   83    89    49    75
126645  AI167942       Hs.61635   STEAP         PAA5  down  106   111   103   71    34    67    33    14    2     1     1     1
119018  N95796         Hs.179809  ESTs          PAB2  down  765   841   757   909   742   704   478   428   253   175   228   238
110844  N31952         Hs.187531  ESTs          PAV7  down  175   192   147   141   123   129   73    65    55    48    54    84
100654  HG2841-HT2969  Hs.75442   Albumin, A    PM01  down  666   605   504   728   357   445   602   187   117   127   117   113
100655  HG2841-HT2970  Hs.75442   Albumin, A    PM02  down  620   653   486   686   368   386   606   175   101   95    115   97
102076  U09579         HS2S2437   cyclin-dep    PM03  down  101   94    143   190   105   107   88    40    34    31    46    22
102208  U22961         Hs.75442   albumin       PM04  down  495   424   323   518   252   296   467   188   169   143   165   145
103739  AA075779                  mitochondr    PM05  down  75    190   606   230   378   106   218   88    69    192   69    99
107036  AA599690       Hs.15725   SBB148        PM06  down  87    124   115   188   132   111   66    71    49    70    38    50
108242  AA062746                  ESTs          PM07  down  14    20    252   13    22    43    193   10    10    104   21    18
108282  AA065143                  solute car    PM08  down  27    54    178   73    108   37    53    24    14    53    15    34
108679  AA115963                  beta-1-gb     PM09  down  680   693   1292  656   869   389   1     74    118   662   359   409
108731  AA126313       Hs.107476  ATPsyntha     PM10  down  10    19    185   25    60    1     32    3     7     14    1     1
110675  H89355         Hs.6598    adrenergic    PM11  down  207   334   237   239   231   220   119   145   93    64    56    124
115412  AA283804       Hs.193552  ESTs          PM12  down  146   316   282   271   340   334   115   238   100   196   83    207
115844  AA430124       Hs.234607  MDM2          PM13  down  49    93    94    154   132   91    23    54    23    76    14    41
120588  AA281591       Hs.16193   ESTs          PM14  down  80    157   58    141   159   127   39    83    35    37    16    46
132349  Y00705         Hs.181286  serine pro    PM15  down  146   217   214   150   106   126   177   85    54    63    68    56
132888  AA490775       Hs.5920    N-acetyima    PM16  down  92    150   132   178   126   139   53    94    48    67    41    60
132967  AA032221       Hs.61635   STEAP         PM17  down  224   208   203   215   205   180   132   65    68    50    48    63
133063  AA283085       Hs.64065   ESTs          PM18  down  85    148   161   150   92    108   42    99    42    65    29    126
134374  D62633         Hs.8236    ESTs          PM19  down  230   240   194   212   231   189   89    123   107   95    68    91
135400  M23263         Hs.99915   androgen r    PM20  down  36    167   99    178   132   101   23    71    26    122   14    44
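
As a non-limiting illustration, the direction of change across the passage columns of Table 10 can be summarized by comparing the earliest and latest passages. The Python sketch below is not the procedure used to assign the Resp column, which the table does not describe: the function name `passage_trend`, the choice of averaging three columns at each end, and the pseudocount of 1 are all assumptions made for the example.

```python
from statistics import mean

# Passage labels in the column order of Table 10 (F00, F02 and F05 each appear twice).
PASSAGES = ["F00", "F00", "F02", "F02", "F05", "F05",
            "F07", "F09", "F10", "F11", "F13", "F14"]


def passage_trend(values, n_edge=3, pseudocount=1.0):
    """Compare the mean of the last n_edge passages with the mean of the first n_edge.

    n_edge and pseudocount are illustrative choices, not values from the source.
    Returns (late/early ratio, "up" | "down" | "flat").
    """
    if len(values) != len(PASSAGES):
        raise ValueError("expected one value per passage column")
    early = mean(values[:n_edge])
    late = mean(values[-n_edge:])
    ratio = (late + pseudocount) / (early + pseudocount)
    if ratio > 1:
        direction = "up"
    elif ratio < 1:
        direction = "down"
    else:
        direction = "flat"
    return ratio, direction


# Two rows copied from Table 10.
pm28 = [1, 9, 8, 9, 32, 20, 34, 122, 105, 82, 71, 111]    # Pkey 117921, Resp "UP"
steap = [106, 111, 103, 71, 34, 67, 33, 14, 2, 1, 1, 1]   # Pkey 126645, Resp "down"

for name, row in (("117921", pm28), ("126645", steap)):
    ratio, direction = passage_trend(row)
    print(f"{name}: late/early ratio {ratio:.2f} ({direction})")
```

For these two rows the sketch agrees with the Resp labels given in the table; other rows with noisier passage values may need a different summary.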
TABLE 11 shows genes, including expressed sequence tags, that are up-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Background subtracted normal prostate : prostate tumor tissue
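
Because R1 is stated as a ratio of normal prostate to prostate tumor tissue, a value below 1 corresponds to higher signal in tumor. The following Python sketch, offered only as an illustration, inverts R1 to express the same figure as a tumor-over-normal fold change; the function names are hypothetical and no thresholds from the source are implied.

```python
import math


def tumor_over_normal(r1: float) -> float:
    """Invert R1 (normal : tumor) to get the tumor : normal fold change."""
    if r1 <= 0:
        raise ValueError("R1 must be a positive ratio")
    return 1.0 / r1


def log2_tumor_over_normal(r1: float) -> float:
    """Same quantity on a log2 scale; positive values mean higher signal in tumor."""
    return math.log2(tumor_over_normal(r1))


# R1 values of the kind listed in Table 11.
for r1 in (0.012, 0.041):
    print(f"R1={r1}: about {tumor_over_normal(r1):.1f}-fold higher in tumor, "
          f"log2 = {log2_tumor_over_normal(r1):.2f}")
```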




Pkey    ExAccn         UnigeneID  Unigene Title  R1
101336  L49169         Hs.75678   FBJ murine osteosarcoma viral oncogene homolog B  0.012
130642  M63438         Hs.156110  immunoglobulin kappa variable 1D4  0.015
133512  X01677         Hs.195188  glyceraldehyde-3-phosphate dehydrogenase  0.017
133436  H44631         Hs.737     immediate early protein  0.017
129292  X13810         Hs.1101    POU domain; class 2; transcription factor 2  0.019
100610  HG2566-HT4792             Microtubule-Associated Protein Tau, Alt Spliced, Exon 8  0.02
133448  M34516         Hs.170116  immunoglobulin lambda-like polypeptide 3  0.021
125193  W67577         Hs.84298   CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)  0.022
133456  T49257         Hs.183704  ubiquitin C  0.022
134546  AA459310       Hs.8518    Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone DKFZp586L1722)  0.023
102131  U15085         Hs.1162    major histocompatibility complex; class II; DM beta  0.023
101375  M13560         Hs.84298   CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)  0.023
100674  HG3033-HT3194             Spliceosomal Protein Sap 62  0.024
134365  R32377         Hs.82240   syntaxin 3A  0.027
132335  D60387         Hs.189885  ESTs  0.027
110303  H37901         Hs.32706   ESTs  0.028
131678  N59162         Hs.30542   ESTs  0.028
116599  D80046         Hs.250879  ESTs  0.029
133769  M17733         Hs.75968   thymosin; beta 4; X chromosome  0.029
107904  AA026648       Hs.61389   ESTs  0.03
129427  T80746         Hs.111334  ferritin; light polypeptide  0.03
105987  AA406631       Hs.110299  mitogen-activated protein kinase kinase 7  0.03
131466  F03233         Hs.27189   ESTs  0.032
102859  X00274         Hs.76807   Human HLA-DR alpha-chain mRNA  0.032
134626  S82198         Hs.8709    caldecrin (serum calcium decreasing factor; elastase IV)  0.032
134170  M63138         Hs.79572   cathepsin D (lysosomal aspartyl protease)  0.033
131713  X57809         Hs.181125  Immunoglobulin lambda gene cluster  0.034
100748  HG3517-HT3711             Alpha-1-Antitrypsin, 5' End  0.034
118769  N74496                    ESTs  0.034
111744  R25375         Hs.126916  ESTs  0.036
109221  AA192755       Hs.85840   ESTs; Weakly similar to stac [H.sapiens]  0.036
133846  AA480073       Hs.76719   U6 snRNA-associated Sm-like protein  0.036
135281  AA401575       Hs.97757   ESTs  0.037
119073  R32894         Hs.45514   v-ets avian erythroblastosis virus E26 oncogene related  0.037
100760  HG3576-HT3779             Major Histocompatibility Complex, Class II Beta W52  0.037
101426  M19483         Hs.25      ATP synthase; H+ transprtng; mitochndrl F1 complex; beta polypept  0.038
129568  AA428025       Hs.114360  transforming growth factor beta-stimulated protein TSC-22  0.038
130900  Z38468         Hs.21036   ESTs; Moderately similar to F25965_3 [H.sapiens]  0.039
133879  M13829         Hs.77183   v-raf murine sarcoma 3611 viral oncogene homolog 1  0.039
100627  HG2702-HT2798             Serine/Threonine Kinase (Gb:Z25424)  0.039
129424  M55593         Hs.111301  matrix metalloproteinase 2 (gelatinase A; 72kD gelatinase; 72kD type IV collagenase)  0.039
128652  AA621245       Hs.103147  ESTs; Weakly similar to similar to SP:YR40_BACSU [C.elegans]  0.039
129979  T72635         Hs.13956   ESTs  0.039
133468  X03068         Hs.73931   major histocompatibility complex; class II; DQ beta 1  0.04
102636  U67092                    Human ataxia-telangiectasia locus protein (ATM) gene, exons 1a, 1b, 2, 3 and 4, partial cds  0.04
129536  M33493         Hs.184504  tryptase; alpha  0.04
133599  M64788         Hs.75151   RAP1; GTPase activating protein 1  0.041
102104 


U12139 




131340 


AM78305 


Hs25817 


130446 


X79510 


Hs.155693 


101352 


L77701 


Hs 16297 


122593 


AA453310 


Hs.128749 


130181 


R39552 


Hs 151608 


134071 


Z14093 


Hs.78950 


108129 


AA053252 


Hs.185848 


130511 


L32137 


Hs.1584 


133336 


AA291456 


Hs 71190 



132982 


LG2326 


Hs 1981 18 


131880 


AA047034 


Hs^3816 


130540 


U35234 


Hs.159534 


133467 


AA256595 


Hs 73931 


101191 


L20688 




101860 




Ks371fi5 


102799 


U88898 




107200 


D20350 




101166 


L14927 


Hs-2099 


134289 


M54915 




135329 


AA436026 




1 24950 


TA37S6 


Uc 151 Ml 


1G2Q19 




no. loo/ ou 


100574 


Hfi297Q-HT2^75 






AA45ATW2 


Uc 953fin 


102fi75 


U72512 




131332 


R50487 


He 25717 


101634 


M57731 




113118 


T47906 


He 220512 




R77276 


Uc 120011 


130523 


W76097 


Hs 214507 


110244 


H26742 


He 25757 


131932 


AA454980 


Hs25R01 


132509 


H09751 


Hs503A 


133372 




Hs 72242 


100817 


HG4011-HT4flfU 




106746 


AA476436 


Hs7Q91 



135401 


L14813 


Ue 169271 


130479 



R44163 


He 12457 


102589 


U62015 


He flflR7 


121521 


AA412165 


Hs_9735A 


135340 


AA425137 


Hs 99093 


132336 


AA342422 


Hs.45073 


115368 


AA282133 




101278 


L38487 


Hs 110849 


103284 


X8Q200 


Hs8375 


100564 


HG2239-HT2324 




133132 


Z40883 


Hs 65588 


121811 


AA424535 


Hs 98416 


129613 


AA279481 


Hs-238831 


132468 


S79854 


Hs.49322 


120111 


W95841 


Hs.136031 


103668 


Z83741 


Hs.248174 


130386 


F10874 


Hs.234249 


104275 


C02170 


Hs.39387 


106305 


AA436146 


Hs.12828 


116431 


AA609878 


Hs.55289 


120339 


AA206465 


Hs.256470 


114427 


AA017063 




118821 


N79070 


Hs.94789 


118979 


N93798 


Hs.43666 


107495 


W78776 


Hs.90375 


120240 


Z41732 


Hs.66049 



Human alpha1 (XI) collagen (COL11A1) gene, 5' region and exon 1

Homo sapiens chromosome 19; cosmid R27216

protein tyrosine phosphatase; non-receptor type 21

COX17 (yeast) homolog; cytochrome c oxidase assembly protein

alpha-methylacyl-CoA racemase

Homo sapiens clone 23622 mRNA sequence

branched chain keto acid dehydrogenase E1; alpha polypeptide

(maple syrup urine disease)

ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING
ENTRY !!!! [H.sapiens]

cartilage oligomeric matrix protein (pseudoachondroplasia;

epiphyseal dysplasia 1; multiple)

ESTs

immunoglobulin lambda-like polypeptide 2
RecQ protein-like 5

protein tyrosine phosphatase; receptor type; S
major histocompatibility complex; class II; DQ beta 1
Rho GDP dissociation inhibitor (GDI) beta
collagen; type IX; alpha 2

Human endogenous retroviral H protease/integrase-derived ORF1
mRNA, complete cds, and putative envelope prot mRNA, partial cds
ESTs

lipocalin 1 (protein migrating faster than albumin; tear prealbumin)

pim-1 oncogene

ESTs

protein phosphatase 3 (formerly 2B); catalytic subunit beta isoform
(calcineurin A beta)



Triosephosphate Isomerase
Homo sapiens clones 24718 and 24825 mRNA sequence
Human B-cell receptor associated protein (hBAP) alternatively
spliced mRNA, partial 3'UTR
ESTs

GRO2 oncogene
ESTs
ESTs
ESTs

ESTs; Weakly similar to ALR [H.sapiens]
chromodomain helicase DNA binding protein 3
neuropathy target esterase
ESTs

Dystrophin-Associated Glycoprotein, 50 Kda, Alt. Splice 2
ESTs

carboxyl ester lipase-like (bile salt-stimulated lipase-like)
Homo sapiens clone 23770 mRNA sequence
cysteine-rich; angiogenic inducer; 61
EST

Homo sapiens chromosome 19; cosmid R28379
ESTs



TNF receptor-associated factor 4
Potassium Channel Protein (Gb:Z11585)
ESTs; Weakly similar to dJ393P122 [H.sapiens]
ESTs

ESTs; Weakly similar to collagen alpha 1(XVIII) chain [M.musculus]

deiodinase; iodothyronine; type III

ESTs

H2A histone family; member M

mitogen-activated protein kinase 8 interacting protein 1

ESTs; Weakly smlr to weak smlrity to ribosomal prot L14 [C.elegans]

ESTs

ESTs; Weakly smlr to 110 KD CELL MEMBRANE GLYCOPROTEIN [H.sapiens]
EST

ESTs; Highly similar to Miz-1 protein [H.sapiens]
ESTs

protein tyrosine phosphatase type IVA; member 3

ESTs

ESTs



0.041 
0.041 
0.042 
0.042 
0.042 
0.042 

0.042 

0.043 

0.043 
0.043 
0.044 
0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.045 
0.045 

0.045 

0.045 

0.046 

0.046 

0.046 

0.046 

0.046 

0.046 

0.046 

0.046 

0.047 

0.047 

0.047 

0.047 

0.047 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.049 

0.049 

0.049 

0.049 

0.049 

0.049 

0.05 

0.813 

0.05 

0.05 

0.05 

0.05 

0.051 

0.051 
114331
130947 
129242 
131413 
112304 
101416 
131201 
101054 
101306 
129311 

129942 
119210 
101046 
114086 
110171 
101004 
129715 
101581 
113285 
127537 
100813 
101841 
135053 
101419 
119724 
102673 
129877 
114788 
123812 
117669 
123782 
102395 
133795 
123193 
132595 
104161 
115330 
112893 
133475 
128699 
102940 
131299 
102495 
129594 
118593 
126702 
124386 

130538 
114299 
115604 
106052 
131730 
131285 
129705 
123175 
103592 
118196 

104886 
104250 

113301 
110441 
125297 
135258 
130633 
112006 



Z41309 

R40037 

W81679 

AA482390 

R54798 

M17254 

AA426304 

K02405 

L41143 

T55087 

U95301 
R93340 
K01160 
Z38266 
H19964 
J04101 
N58479 



T66830 

AA569531 

HG3995-HT4265 

M93107 

R77159 

M17886 

W69468 

U72509 

AA248589 

AA156737 

AA620607 

N39237 

AA610111 

U41767 

M12529 

AA489228 

AA253369 

AA456471 

AA281145 

T08000 

L29217 

K03207 

X13956 

AA431464 

U51240 

R70379 

N69Q20 

U54602 

N27368 

M20786 

Z40782 

AA400378 

AA416947 

U05681 

AA479498 

X78706 

AA489010 

Z30644 

N59478 

AA053348 
AF000575 

T67452 

H50302 

Z39215 

AA292423 

T92363 

R42607 



Hs.12400 

Hs^1506 

Hs.5174 

HS26510 

Hs.26239 

Hs.45514 

Hs.24174 

Hs.73933 

Hs.232069 



Hs.144442 
Hs.92995 

Hs.12770 

Hs.31709 

Hs£48109 

Hs.12126 

Hs.1 98253 

Hs.1 82712 

Hs.162859 

Hs.76893 
Ha93678 
Hs.177592 
Hs.47622 

HS.1 3094 

Hs.103904 

HS.1 11591 

Hs.44977 

Hs.162695 

Hs.92208 

Hs.169401 

Hs.136956 

Hs.155742 

Hs.7724 

Hs.88827 

Hs.1 94684 

Hs.73987 

Hs.103972 

HS24998 

H&25426 

Hs.79356 

Hs.1 15396 

Hs.207689 

Hs.2785 

Hs.212414 

Hs.159509 

Hs.22920 

Hs.49391 

Hs.6382 

Hs.31210 

HS25274 

Hs.12068 

Hs.178400 

Hs.123059 

Hs.48396 

Hs.144626 
Hs.105928 

Hs.13104 

Hs.19845 

Hs.159409 

Hs.97272 

Hs.178703 

Hs.22241 



ESTs
ESTs

ribosomal protein S17

ESTs; Modty smlr to vacuolar prot sorting homolog r-vps33b [R.norvegicus]
ESTs

v-ets avian erythroblastosis virus E26 oncogene related
ESTs

Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds

T-cell leukemia translocation altered gene

yb45c08.r1 Stratagene fetal spleen (#937205) Homo sapiens cDNA

clone IMAGE:74126 5', mRNA sequence.

phospholipase A2; group X

ESTs

Accession not listed in Genbank

Homo sapiens PAC clone DJ0777O23 from 7p14-p15

ESTs

v-ets avian erythroblastosis virus E26 oncogene homolog 1

ESTs; Weakly similar to LR8 [H.sapiens]

major histocompatibility complex; class II; DQ alpha 1

ESTs

ESTs

Cpg-Enriched Dna, Clone S19

3-hydroxybutyrate dehydrogenase (heart; mitochondrial)

ESTs

ribosomal protein; large; P1
ESTs

Human alternatively spliced B8 (B7) mRNA, partial sequence

ESTs; Weakly similar to ORF YGR101w [S.cerevisiae]

EST

ESTs

ESTs

EST

a disintegrin and metalloproteinase domain 15 (metargidin)

apolipoprotein E

ESTs

glyoxylate reductase/hydroxypyruvate reductase

KIAA0963 protein

ESTs

bassoon (presynaptic cytomatrix protein)
CDC-like kinase 3

proline-rich protein BstNI subfamily 4

Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus
ESTs; Weakly similar to unknown [H.sapiens]
Lysosomal-associated multispanning membrane protein-5
Human germline IgD chain gene; C-region; C-delta-1 domain
EST

keratin 17

sema domain; immunoglobulin domain (Ig); short basic domain;

secreted; (semaphorin) 3E

alpha-2-plasmin inhibitor

similar to S68401 (cattle) glucose induced gene

ESTs

ESTs; Highly similar to KIAA0612 protein [H.sapiens]
B-cell CLL/lymphoma 3

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens]



ESTs

chloride channel Kb

ESTs; Moderately similar to tumor necrosis factor-alpha
-induced protein B12 [H.sapiens]
growth differentiation factor 11

leukocyte immunoglobulin-like receptor; subfamily B (with TM

and ITIM domains); member 3

EST

ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens]
ESTs

ESTs; Weakly similar to dJ281 HQ2 [H.sapiens]
ESTs

hypothetical protein




0.051 
0.052 
0.052 
0.052 
0.052 
0.052 
0.052 
0.052 
0.053 

0.053 
0.053 
0.053 
0.053 
0.053 
0.053 
0.053 
0.053 
0.053 
0.053 
0.054 
0.054 
0.054 
0.054 
0.054 
0.055 
0.055 
0.055 
0.055 
0.055 
0.055 
0.055 
0.055 
0.055 
0.056 
0.056 
0.056 
0.056 
0.056 
0.056 
0.056 
0.056 
0.057 
0.057 
0.057 
0.057 
0.057 

0.057 
0.057 
0.057 
0.057 
0.057 
0.057 
0.058 
0.058 
0.058 
0.058 

0.058 
0.058 

0.058 
0.058 
0.058 
0.058 
0.058 
0.058 
30805
34907 
32619 
35115 
00531 
24530 
19960 
32793 
01076 
30655 
34458 
05904 
32878 
21828 
33418 
29317 
30153 
24403 
27683 
29814 
31770 
17557 
03522 
20029 
02135 
23617 
12136 
33725 
02069 
06555 



U12194 
080002 
AA404565 
N35489 

HG1872-HT1907 

N62256 

W87533 



29375 
35271 
32958 
29364 
23427 
05236 
01012 
34791 
33700 
23887 
29363 
05719 
24226 
17437 

32741 
34437 
07664 
20844 
01574 
31219 
03495 
29607 
06467 
28641 
i00515 
19332 
34516 
35012 
03575 
15514 



L04270 

N92934 

AA192614 

AM01452 

AA026793 

AA425166 

U76366 

N46244 

D85815 

N31745 

AA668123 

W20070 



N33920 

Y10514 

W91960 

U15460 

AA609183 

R46100 

V00563 

U09196 

AA455000 

AA491226 

AA166837 

AA263028 

W79850 

AA397763 



10505 
33912 
29581 



AM77106 

AA598548 

AA219179 

J04444 

L18983 

K01396 

AA621065 

H05704 

AA291644 

H62396 

N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C00476 

Y09022 

AA404594 

AA450040 

T16358 

HG1723-HT1729 

T54095 

AA171939 

X73608 

Z26256 

AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238 
Hs.178292 
Hs.53447 
Hs.94653 

Hs.102727 

Hs.32699 

Hs.56966 

Hs.1116 

Hs.17409 

Hs.83577 

Hs.32060 

Hs.58679 

Hs.98497 

Hs.172727 

Hs.110373 

Hs.15114 

Hs.102493 

Hs.134170 

Hs.168625 

Hs.31833 

Hs.44532 

HS250640 

Hs.41691 

Hs.181131 

Hs.9739 

Hs.179543 

Hs.82520 

Hs.16725 

Hs.105280 

Hs.72620 

Hs.1 11 076 

Hs.1 1081 

Hs.97562 

Hs.6147 

Hs.110757 

Hs.1 12471 

Hs.19105 

Hs.697 

Hs.89655 

Hs.75621 

Hs.1 12943 

Hs.110746 

Hs.36793 

Hs.1 90266 



Hs.55898 

Hs.198253 

Hs.5326 

Hs.96917 

Hs.158029 

Hs.24395 

Hs.153591 

Hs.1 1607 

Hs.154162 

Hs.106443 



Hs.23413 
Hs.93029 

Hs.55609 



Hs.20495 
Hs.77522 
Hs.180255 



sodium channel; voltage-gated; type I; beta polypeptide 0.058

KIAA0180 protein 0.058

ESTs; Moderately similar to kinesin light chain 1 [M.musculus] 0.058

neurochondrin 0.058

Major Histocompatibility Complex, Dg 0.058

EST 0.058

ESTs; Moderately similar to LIV-1 protein [H.sapiens] 0.058

KIAA0906 protein 0.058

lymphotoxin beta receptor (TNFR superfamily; member 3) 0.058

cysteine-rich protein 1 (intestinal) 0.058

cysteine and glycine-rich protein 3 (cardiac LIM protein) 0.058

ESTs 0.059

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus] 0.059

ESTs 0.059

Treacher Collins-Franceschetti syndrome 1 0.059

ESTs 0.059

ras homolog gene family; member D 0.059

ESTs 0.059

ESTs 0.059

KIAA0979 protein 0.059

ESTs 0.06

diubiquitin 0.06

H.sapiens mRNA for CD152 protein 0.06

sequence-specific single-stranded-DNA-binding protein 0.06

activating transcription factor B 0.06

ESTs 0.06

ESTs 0.061

immunoglobulin mu 0.061

Hu 1.1 kb mRNA upregltd in retinoic acid treated HL-60 neutrophilic cells 0.061

ESTs 0.061

ESTs; Weakly similar to dJ963K235 [H.sapiens] 0.061

DKFZP4341114 protein 0.061

malate dehydrogenase 2; NAD (mitochondrial) 0.061

ESTs; Weakly similar to HPBRII-7 protein [H.sapiens] 0.061

ESTs 0.061

KIAA1075 protein 0.061

DNA segment on chromosome 21 (unique) 2056 expressed sequence 0.061

ESTs 0.061

translocase of inner mitochondrial membrane 17 (yeast) homolog B 0.061

cytochrome c-1 0.062

protein tyrosine phosphatase; receptor type; N 0.062

protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 0.062

ESTs 0.062

H.sapiens HCR (a-helix coiled-coil rod homologue) mRNA; complete cds 0.062

ESTs 0.062

ESTs 0.062
yw5e3.s1 Weizmann Olfactory Epithelium H.sapiens cDNA clone

IMAGE:555676 3' smlr to contains L1.t3 L1 repetitive element ;, mRNA seq 0.062

ESTs; Highly similar to OASIS protein [M.musculus] 0.062

major histocompatibility complex; class II; DQ alpha 1 0.062

ESTs; Moderately similar to pim-1 protein [H.sapiens] 0.062

ESTs 0.062

protein kinase; cAMP-dependent; catalytic; gamma 0.062

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK) 0.062

Not56 (D. melanogaster)-like protein 0.062

ESTs 0.062

ADP-ribosylation factor-like 2 0.062

ESTs 0.062

Macrophage Scavenger Receptor, Alt Splice 2 0.062
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 0.062

ESTs 0.062

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican) 0.063

H.sapiens isoform 1 gene for L-type calcium channel, exon 1 0.063
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE;
CYTOPLASMIC [H.sapiens]

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence 0.063

DKFZP434F011 protein 0.063

major histocompatibility complex; class II; DM alpha 0.063

major histocompatibility complex; class II; DR beta 1 0.063
130139


R38280 


Hs.150922 


105B1 7 




nS.OOUf 


134o5o 




ns.i /ouuy 


1 00300 


LJ5Q435 


Ue OnCQO 

nS.oUoyo 


100277 


U42053 


Ue 7cflon 
nS./ooyo 


133116 


D61259 


Hs.6529 


134909 


A A m a a no 

AA521488 


Li _ n Anno 

Hs.90998 


130319 


X74794 


nS-15444o 


132057 


AA1Q2489 


HS.173484 


108334 


AA070473 




129763 


F10815 


Hs.1 2373 


135112 


T67464 


Hs.94617 


122269 


AA436856 


Hs.98910 


133082 


AA457129 


Hs.6455 


113213 


T58607 




106228 


AA429290 


Hs.17719 


130192 


Y12661 


Hs.171014 


104894 


AA054087 


Hs.18858 


103508 


Y10141 




128474 


U40671 


Hs.100299 


134012 


AA417821 


HS237924 


134536 


AA457735 


Hs.850 


111714 


R23146 


Hs.23466 


110521 


H57060 


Hs.108268 


103282 


X80198 


Hs.77628 


113921 


W80730 


Hs.28355 


129331 


N93465 


Hs.1 10453 


111316 


N74597 


Hs.1 80535 


135138 


AA036794 


Hs.95196 


107289 


T10792 


Hs.1 72098 


121405 


AA406083 


Hs.98007 


124965 


T16275 


Hs.106359 


106595 


AA456933 


Hs.1 74481 


100106 


AF015910 




134715 


AA282757 


Hs.89040 


135367 


AA480109 


Hs.9963 


111533 


R08548 


Hs.251651 


128509 


R53109 


H&247362 


101030 


J05037 


Hs.76751 


102753 


U80226 




126991 


R31652 


Hs.821 


109583 


F02322 


H&26135 


119241 


T12559 


Hs.221382 


130569 


AA156597 



Hs.256441


112926 


T10316 


Hs.4302 


120495 


AA256073 


Hs.190626 


130931 


AA276412 


HS21346 


129982 


M87789 


Hs.140 


133832 


H03387 


Hs.241305 


110697 


H93721 


Hs.20798 


121183 


AA400138 


Hs.97703 


130953 


U12707 


Hs.2157 


102218 


U24183 


Hs.75160 


114181 


Z39079 


Hs.8021 


lloool 




nS.oi:14o 


132498 


T87708 


Hs.50098 


103788 


AA096014 


Hs.9527 


102459 


U48936 




100373 


D79999 


Hs.77225 


132717 


AA203321 


Hs.151696 


128863 


D87462 


Hs.106674 


115193 


AA262029 


Hs.88218 


124558 


N66046 


Hs.141605 


117225 


N20392 


Hs.42846 


110665 


H83380 


Hs.32757 



BCS1 (yeast homolog)-like

synaptopodin

ESTs

transcription elongation factor A (SII); 2

site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory
element binding proteins)

ESTs

KIAA0128 protein

minichromosome maintenance deficient (S. cerevisiae) 4
ESTs 

zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA
clone IMAGE-5399 3', mRNA sequence
KIAA0422 protein

ESTs; Weakly similar to predicted using Genefinder [C.elegans]
ESTs

RuvB (E. coli homolog)-like 2

ya94aQ2.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone
IMAGES9290 3', mRNA sequence.

ESTs 

VGF nerve growth factor inducible 

phospholipase A2; group IVC (cytosolic; calcium-independent)

H.sapiens DAT1 gene, partial, VNTR

ligase III; DNA; ATP-dependent

ESTs; Highly similar to CGI-69 protein [H.sapiens]

IMP (inosine monophosphate) dehydrogenase 1 

ESTs 

ESTs 

steroidogenic acute regulatory protein related 
ESTs 

ESTs; Highly similar to CGI-38 protein [H.sapiens]

ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens]

ESTs; Weakly similar to T20B12.3 [C.elegans]

ESTs 

ESTs 

ESTs 

ESTs 

Homo sapiens unknown protein mRNA, partial cds
prepronociceptin

TYRO protein tyrosine kinase binding protein
EST

dimethylarginine dimethylaminohydrolase 2



Human gamma-aminobutyric acid transaminase mRNA, partial cds



ESTs 
ESTs 

EST; Moderately similar to CGI-136 protein [H.sapiens] 

ESTs 

ESTs 

ESTs; Weakly similar to F42C5.7 gene product [C.elegans]

immunoglobulin gamma 3 (Gm marker) 

estrogen-responsive B box protein 

ESTs 

ESTs 

Wiskott-Aldrich syndrome (eczema-thrombocytopenia)

phosphofructokinase; muscle

KIAA1058 protein

ribosomal protein S12

ESTs

ESTs; Highly similar to HSPC013 [H.sapiens]

Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA,
5' end, partial cds

ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1
DKFZP727G051 protein 

BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase) 

ESTs 

ESTs 

ESTs 

ESTs 



0.064 
0.064 
0.064 
0.064 

0.064 
0.064 
0.064 
0.064 
0.064 

0.064 
0.064 
0.064 
0.064 
0.064 

0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 



0.065 
0.065 
0.065 
0.065 
0.065 
0.065
0.065 
0.065 
0.065 
0.066 
0.066 
0.066 
0.066 
0.066 
0.066 
0.066 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.068 
0.068

0.068 
0.068 
0.068 
0.068 
0.068 



0.069 
0.069 
132905


U70663 


Hs.182965


Kruppel-like factor 4 (gut)


105778 


AA348910 


Hs.153299


DOM-3 (C. elegans) homolog Z


134770 


R72079 


Hs.89575 


CD79B antigen (immunoglobulin-associated beta)


123097 


AA485869 


Hs.105671


ESTs 


100750 


HG3523-HT4899 




Proto-Oncogene C-Myc, Alt Splice 3, Orf 114


125091 


T91518 




ye20rQ5.s1 Stratagene lung (#937210) H sapiens cDNA clone IMAGE:
3' similar to contains Alu repetitive element;contains MER12 repetitive element;
mRNA sequence.


100756 


HG3565-HT3768 




Zinc Finger Protein (Gb:M88357) 


113483 


T87768 


Hs.16439


ESTs 


101119 


L09708 


Hs.2253


complement component 2 


102286 


U31628 


Hs.12503 


interleukin 15 receptor; alpha


135349 


D83174 


Hs.9930 


collagen-binding protein 2 (colligin 2)


100991 


J03764 


Hs.82085 


plasminogen activator inhibitor; type 1 


133675 


AA443720 


Hs.7551 


ESTs; Weakly similar to T25G3.1 [C.elegans]


105422 


AA251014 


Hs.12210 


ESTs 


102932 


X13334 


HsJ5627 


CD14 antigen


119147 


R58878 


Hs.65739 


ESTs 


104900 


AA055048 


Hs.180481 


ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens]


133185 


AA481404 


Hs.6686 


ESTs 


115496 


AA290674 


Hs.71819 


eukaryotic translation initiation factor 4E binding protein 1 


121005 


AA398332 


Hs.97613 


ESTs 


124869 


R69088 


Hs.28728 


ESTs; Weakly similar to F55A12S [C.elegans]


129154 


N23673 


Hs.108969 


mannosidase; alpha; class 2B; member 1 


112161 


R48295 




ESTs; Wkly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]


125251 


W87486 


Hs.141464 


ESTs 


134298 


J00116 


Hs.81343 


collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal 
dysplasia; congenital) 


119745 


W70264 


Hs.58093 


ESTs 


131306 


AA232686 


Hs.25489 


ESTs 


107776 


AA018820 


Hs.221147 


ESTs 


134271 


AA199630 


Hs.184456 


ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens]


101798 


M85220 




Accession not listed in Genbank 


135402 


S76942 


Hs.99922 


dopamine receptor D4 


118742 


N74052 


Hs.50424 


EST 


131867 


N64656 


Hs.3353 


Homo sapiens clone 24940 mRNA sequence 


102923 


X12517 


Hs.1063 


small nuclear ribonucleoprotein polypeptide C 


100775 


HG371-HT26388 




Mucin 1, Epithelial, Alt Splice 9


111020 


N54361 


Hs.105726 


ESTs 


134224 


X80822 


Hs.163593 


ribosomal protein L18a 


124059 


F13673 


Hs.99769 


ESTs 


133972 


AA160743 


Hs.78019 


Homo sapiens clone 24432 mRNA sequence 


129681 


AA436009 


Hs.178186 


ESTs; Weakly similar to WASP-family protein [H.sapiens]


103065 


X58399 


Hs.81221 


Human L2-9 transcript of unrearranged immunoglobulin V(H)5 pseudogene 


124966 


T19271 


Hs.155560 


calnexin


112270 


R53Q21 


Hs.203358


ESTs 


116704 


F10183 


Hs.66140 


EST 


129890 


M13699 


Hs.111461


ceruloplasmin (ferroxidase)


127345 


AA972008 


Hs.166253


ESTs; Highly similar to KIAA0476 protein [H.sapiens]


112436 


R63090 


Hs£8391 


ESTs 


114531 


AA053033 


Hs£0333G 


ESTs 


135122 


H99080 


Hs.94814 


ESTs 


103934 


AA281338 


Hs.134200 


Homo sapiens mRNA; cDNA DKFZp564C186 (from clone DKFZp564C186)


109363 


AA215369 


Hs.185764 


ESTs; Weakly similar to hypothetical protein [H.sapiens]


112647 


R83329 


Hs.33403 


ESTs 


127083 


244079 


Hs.91608 


otoferlin


133027 


AA402624 


Hs.63236 


synuclein; gamma (breast cancer-specific protein 1)


122086 


AA432121 


HsJ250986 


EST 


110405 


H47542 


Hs.33962 


ESTs 


128697 


AB002344


Hs.103915 


KIAA0346 protein 


112221 


R50380 


Hs.25670 


ESTs 


100478 


HG1067-HT1067 




Mucin (Gb:M22406) 


115598 


AA400129 


Hs.65735 


ESTs 


132491 


AA227137 


Hs.4984 


KIAA0828 protein 


101655 


M60299 




Human alpha-1 collagen type II gene, exons 1, 2 and 3


106018 


AA411887 


Hs.34737 


ESTs 


129683 


W05348 


Hs.158196 


DKFZP434B103 protein 


134137 


F10045 


HsJ9347 


KIAA0211 gene product 


114008 


W89128 


Hs.19872 


ESTs 



0.069 
0.069 
0.069 
0.069 
0.069 



0.069 

0.069 

0.069 

0.069 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07

0.07 

0.07 

0.07 

0.07 

0.071 

0.071 

0.071 

0.071 

0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.072
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
0.073
107653
104798 
134082 
119180 
107741 
133683 
111694 
120764 
119389 
100929 
119388 
133019 
105185 
133413 
101017 
132865 
110882 
129197 
101184 
134910 
119411 
102000 
114691 
134179 
134503 

129719 
113916 
113897 
129697 
112078 
121980 
100898 
121626 
133670 
131879 
100254 
133194 
106081 
115544 
119955 
104407 
135019 
114815 
119471 
117788 
119406 
130777 
130494 
104107 
121483 
104451 
118027 
109419 
115783 
110585 
123165 
103966 
109549 
106730 
120310 
104078 
117624 
112421 
106958 
129984 
122044 
123280 
115710 



AA010210 

AAQ29462 

L16991 

R80413 

AA016982 

AA335223 

R22035 

AA338729 

T88826 

HG688-HT688 

T88798 

AF009674 

AA191495 

S72043 

J04599

K02765 

N36001 

T90303 

L19871 

AA431320 

T96621 

U01824 

AA121893 

U53204 

U34880 

N66396 

W80464 

W73926 

R00841 

R44155 

AA429886 

HG4638-HT5050 

AM16974 

AA243416 

AA017161 

D38037 

AA291726 

AM18394 

AA351433 

W87460 

H61361 

X58431 

AA161488 

W31352 

N48292 

T95064 

R61742 

L13197 

AA424111 

AM11981 

M13299 

N52770 

AA227560 

AM24487 

H62223 

AA488863 

AA303166 

F01528 

AA465520 

AA193676 

AA402801 

N35978 

R62441 

AA497026 

W92811 

AA431456 

AA491285 

AA412535 



Hs.47041 

Hs.17235 

Hs.79006

Hs.92520 

Hs.64341 

Hs.75558 

Hs.23331 

Hs.133096 

Hs.90973 



Hs.184434 

Hs.189937 

Hs.73133 

Hs.821 

Hs.251972 

Hs.17348 

Hs.109308 

Hs.460 

Hs.9100 

Hs.203656 

Hs.380 

Hs.103779 

Hs.79706 

Hs.84183 

Hs.167766 

Hs.31928 

Hs.4947 

Hs.172069 

Hs.112218

Hs.110407 

Hs.98174 
Hs.75470 
Hs.33792 
Hs.77643 
Hs.67201 
Hs.25354 
Hs.66187 



Hs.102171 

Hs.98428 

Hs.103931 

Hs.55445 

Hs.46849 

Hs.193771 

Hs.256554 

Hs.75874 

Hs.12598 

Hs.25274 

Hs.102119 

Hs.75968 

Hs.86987 

Hs.72289 

Hs.133526 

Hs.105216 

Hs.127270 

Hs.21192 

Hs.22313 

Hs.118926

Hs.222010

Hs.82364 

Hs£3127 

Hs£2059 

Hs.183927 

Hs.98736 

Hs.175144 



ESTs 0.073 

ESTs 0.073 

deoxythymidylate kinase 0.073

ESTs 0.073 

ESTs 0.073 

pepsinogen 5; group I (pepsinogen A) 0.073 

ESTs 0.073 

ESTs 0.073 

ESTs 0.074 

Major Histocompatibility Complex, Class II, Dr Beta 2 (Gb:X65561) 0.074

plasminogen activator inhibitor, type I 0.074 

axin 0.074 

ESTs 0.074 

metallothionein 3 (growth inhibitory factor (neurotrophic)) 0.074 

biglycan 0.074 

complement component 3 0.074 

ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 0.074

ESTs; Wkly smlr to leucine-rich glioma-inactivated prot precursor [H.sapiens] 0.074

activating transcription factor 3 0.075 

ESTs 0.075 

EST 0.075 

solute carrier family 1 (glial high affinity glutamate transporter); member 2 0.075

ESTs; Weakly similar to envelope protein [H.sapiens] 0.075

plectin 1; intermediate filament binding protein; 500kD 0.075
diphtheria toxin resistance protein required for diphthamide
biosynthesis (Saccharomyces)-like 1 0.075

ESTs; Moderately similar to Pro-a2(XI) [H.sapiens] 0.075

ESTs; Wkly smlr to alternatively spliced product using exon 13A [H.sapiens] 0.075

ESTs 0.075 

DKFZP434C212 protein 0.075 

ESTs 0.075 
ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans] 0.075

Spliceosomal Protein Sap 49 0.075 

ESTs 0.075 

hypothetical protein; expressed in osteoblast 0.075 

ESTs 0.075 

FK506-binding protein 1B (12.6 kD) 0.075

ESTs 0.075 

ESTs 0.075 

Homo sapiens clone 23700 mRNA sequence 0.076 

ESTs 0.076 

immunoglobulin superfamily containing leucine-rich repeat 0.076

Human Hox£2 gene for a homeobox protein 0.076 

DKFZP434B0335 protein 0.076

ESTs 0.076 

ESTs 0.076 

EST 0.076 

ESTs 0.076 

pregnancy-associated plasma protein A 0.076 

T-cell lymphoma invasion and metastasis 2 0.076 

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.076

blue cone pigment 0.076 

thymosin; beta 4; X chromosome 0.076 

receptor-interacting serine-threonine kinase 3 0.076 

ESTs; Weakly similar to UV-1 protein [H.sapiens] 0.076

ESTs; Wkly smlr to !! ALU SUBFAMILY SB1 WARNING ENTRY !! [H.sapiens] 0.076

ESTs; Weakly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

ESTs 0.077 

Homo sapiens clone 25155 mRNA sequence 0.077 

ESTs 0.077 

DKFZP586K0919 protein 0.077 

ESTs 0.077 

ESTs 0.077 

ESTs 0.077 

ESTs 0.077 
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

EST 0.077 

ESTs 0.077 
sphingomyelin phosphodiesterase 2; neutral membrane (neutral sphingomyelinase) 0.077

134129 D87444 HsJ9305
129321 AA224502 Hs.206501
130513 AA460257 Hs.15866
100996 J03909 Hs.14623
128358 AI095718 Hs.135015
128544 R59352 Hs.119273
106040 AA4126B1 Hs.125139
106495 AA452113 Hs.32454
131833 R40899 Hs.32973
119219 R97176 Hs.110783
135415 X60655 Hs.99967
109457 AA232646 Hs.68061
117137 H96670 Hs.42221
107094 AA609614 Hs.5241
130165 T90529 Hs.251613
124072 H05252 Hs.101637
126151 AA324743 Hs.40808
119035 R01779 Hs.7740
110157 H18987 Hs.169731
128515 AA149044 Hs.10086
133069 U94836 Hs.6430
112209 R49644 Hs.24865
133361 R28279 Hs.71848
134714 U89922 Hs.890
129905 T86796 Hs.132875
120421 AA236166 Hs.132957
100885 HG4490-HT4876
102789 U86759 Hs.158336
120139 Z39273 Hs.77876
135238 U76343 Hs.96970
129618 N54S45 Hs.173030
132960 AA609742 Hs.6150
108751 AA127063 Hs.203717
134060 D42039 Hs.78871
111338 N79778 Hs.35094
112345 R56880 Hs.26563
126456 W00881
128937 Z39939 Hs.10726
103485 Y08409 Hs.248415
111202 N68280 Hs.107922
132625 AA429890 Hs.166066
103434 X98085 Hs.54433
102616 U65581 Hs.159191
102667 U70867 Hs.83974
111422 R01127 Hs.19104
101411 M16938 Hs*20
113267 T65058 Hs.12725
103559 Z19585 HsJ5774
131588 AA258613 Hs.29189
107821 AA020991 Hs.172856
134278 H82839 Hs.81001
120893 AA369800 Hs.97058
108786 AA128999
106690 AA489245 Hs.88500
119760 W72267 Hs.58219
132999 Y00787 Hs.624
129156 AA028195 Hs.108973
121171 AM00008 Hs.161814
103864 AA207264 Hs.181077
128591 AA255537 Hs.102057
122172 AA435753 Hs.161854
112802 R97647 Hs.174855
107723 AA015967 Hs.60680
113011 T23737 Hs.1600
131279 AA089853 Hs.25197
103190 X70083 Hs.58414

KIAA0255 gene product 0.077

Homo sapiens clone 643 unknown mRNA; complete sequence 0.078 

ESTs 0.078 

Interferon; gamma-inducible protein 30 0.078 

ESTs 0.078 

KIAA0296 gene product 0.078 

ESTs 0.078 

ESTs; Moderately similar to KIAA0544 protein [H.sapiens] 0.078

glycine receptor; beta 0.078 

ESTs 0.078 

even-skipped homeo box 1 (homolog of Drosophila) 0.078 

ESTs; Weakly similar to sphingosine kinase [M.musculus] 0.078 

ESTs 0.078 

ESTs 0.078 

EST 0.078 

EST; Weakly similar to hypothetical protein [H.sapiens] 0.078 

ESTs 0.078 

ESTs 0.078 

ESTs 0.078 

ESTs; Highly similar to HYPOTHETICAL PROTEIN KIAA0195 [H.sapiens] 0.078

protein with polyglutamine repeat 0.078

ESTs 0.078

Human clone 23548 mRNA sequence 0.078

lymphotoxin beta (TNF superfamily; member 3) 0.078

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.079

ESTs; Weakly similar to chondromodulin-I precursor [H.sapiens] 0.079

Proline-Rich Protein Prb4, Allele 0.079 

netrin2(chicken)-lilcB 4 0.079 

Human DNA from chromosome 19-specific cosmid R30923; genomic sequence 0.079

Human liver GABA transport protein mRNA; 3' end 0.079

ESTs 0.079 

KIAA0521 protein 0.079 

ESTs 0.079 

KIAA0081 protein 0.079

extracellular matrix protein 2; female organ and adipocyte specific 0.079 

ESTs 0.079 
za56d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone
IMAGE:296547 5', mRNA sequence. 0.079

ESTs 0.079 

thyroid hormone responsive SPOT14 (rat) homolog 0.079 

ESTs 0.079 

cisplatin resistance associated 0.079

tenascin R (restrictin; janusin) 0.079

ribosomal protein L3-like 0.079

solute carrier family 21 (prostaglandin transporter); member 2 0.079

ESTs 0.079 

homeo box C5 0.08
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.08

thrombospondin 4 0.08 

KIAA1021 protein 0.08 

ESTs 0.08 

ESTs; Weakly similar to DY3.6 [C.elegans] 0.08 

EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens] 0.08
zo8f12.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens
cDNA clone IMAGE:567119 3', mRNA sequence 0.08

KIAA1066 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse) 0.08 

ESTs 0.08 

interleukin 8 0.08

dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit 0.08

ESTs 0.08 

ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens] 0.08 

ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens] 0.08

EST 0.08 

EST 0.08 

EST 0.08 

chaperonin containing TCP1; subunit 5 (epsilon) 0.081

STIP1 homology and U-Box containing protein 1 0.081

filamin C; gamma (actin-binding protein-280) 0.081



WO 02/30268 PCT/US01/32045



103956 AA292411 Hs.233348 ESTs 0.081
112706 R89828 Hs.138493 ESTs 0.081
126126 M85370 EST01684 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA clone HFBCH10, mRNA sequence. 0.081
130094 H43286 Hs.167017 gamma-aminobutyric acid (GABA) B receptor; 1 0.081
100800 HG3945-HT4215 Phospholipid Transfer Protein 0.081
108675 M115240 Hs.61816 ESTs 0.081
129420 AA234259 Hs.99816 ESTs 0.081
129666 M77349 Hs.118787 transforming growth factor; beta-induced; 68kD 0.081
101645 M59807 Hs.943 natural killer cell transcript 4 0.081
130536 T17045 Hs.159492 spastic ataxia of Charlevoix-Saguenay (sacsin) 0.081
107732 AA016181 Hs.59752 ESTs 0.081
123071 AA482593 Hs.104285 ESTs 0.081
113537 T90457 Hs.191293 ESTs 0.081
101250 L34060 Hs.79133 cadherin 8 0.081
122521 AA449433 Hs.149227 ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus] 0.081
133914 N32811 Hs.77542 ESTs 0.081
102038 U05659 Hs.477 hydroxysteroid (17-beta) dehydrogenase 3 0.081
110336 H40338 Hs.174094 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.081


118637 N70274 Hs.49822 ESTs 0.081
117966 N51589 Hs.94012 ESTs 0.082
104424 H87671 Hs.182320 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 0.082
100361 D78361 Hs.125078 Human mRNA for ornithine decarboxylase antizyme; ORF 1 and ORF 2 0.082
112974 T17291 Hs.101174 microtubule-associated protein tau 0.082
132832 D63482 Hs.57734 KIAA0148 gene product 0.082
132039 Z39489 Hs.3781 Homo sapiens BAC clone RG118D07 from 7q31 0.082
113272 T65383 Hs.12807 ESTs 0.082
104924 AA058532 Hs£8774 ESTs 0.082
111061 N58054 Hs.36859 ESTs 0.082
129269 R45977 Hs.163593 ribosomal protein L18a 0.082
102453 U48437 Hs.74565 amyloid beta (A4) precursor-like protein 1 0.082
126204 AI080388 Hs.134296 ESTs 0.082
116615 D80666 Hs.45203 ESTs 0.082
128856 AA219552 Hs.204144 ESTs; Modly smlr to tumor necrosis factor-alpha-induced prot B12 [H.sapiens] 0.082
112776 R95850 Hs.34494 ESTs 0.082
105494 AA256273 Hs.29288 Homo sapiens mRNA; cDNA DKFZp434P174 (from clone DKFZp434P174) 0.082
117000 H84718 Hs.112236 ESTs; Weakly similar to repressor protein [H.sapiens] 0.082
112656 R85260 Hs.133151 transient receptor potential channel 7 0.082


126963 J03890 Hs.1074 surfactant; pulmonary-associated protein C 0.083
116957 H79292 Hs.39960 ESTs 0.083
101057 K03430 Human complement C1q B-chain gene, exon A+1 0.083
121948 AA429452 Hs.98582 ESTs 0.083
130822 M80647 Hs.2001 thromboxane A synthase 1 (platelet; cytochrome P450; subfamily V) 0.083
122743 AA458674 Hs.99478 EST 0.083
114569 AA063316 zm2d1.s1 Stratagene corneal stroma (#937222) Homo sapiens cDNA clone IMAGE:512947 3' similar to TR:E198281 E198281 THIOREDOXIN REDUCTASE ;contains Alu repetitive element;, mRNA sequence 0.083
132270 U70671 Hs.43509 ataxin 2 related protein 0.083
108126 AA052951 Hs.47413 ESTs 0.083
102880 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth neuropathy; X-linked) 0.083
115365 AA282089 Hs.88599 ESTs 0.083
114529 AA052980 Hs.206704 ESTs 0.083
135017 AA249586 Hs.9315 ESTs; Weakly similar to NEURONAL OLFACTOMEDIN-RELATED ER LOCALIZED PROTEIN [H.sapiens] 0.083
123776 AA610071 Hs.112813 ESTs 0.083


114454 AA021091 Hs.226208 ESTs 0.083
101246 L33799 Hs.202097 procollagen C-endopeptidase enhancer 0.083
107366 U78310 Hs.13501 pescadillo (zebrafish) homolog 1; containing BRCT domain 0.083
132779 T89601 Hs.95497 ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5; SMALL INTESTINE [H.sapiens] 0.083
129709 AA112209 Hs.1209 acyl-Coenzyme A dehydrogenase; long chain 0.083
115244 AA278767 Hs.914 Human mRNA for SB class II histocompatibility antigen alpha-chain 0.083
123253 AA490878 Hs.111334 ferritin; light polypeptide 0.083
128469 T23724 Hs.258677 EST 0.083
132220 AA431847 Hs.42409 ESTs; Highly similar to CGI-146 protein [H.sapiens] 0.083
111664 R17939 Hs.22344 ESTs 0.083
102354 U38268 Human cytochrome b pseudogene, partial cds 0.084
112828 R98774 Hs.194338 ESTs 0.084






110410 


H47868 


Hs.34024 


102620 


U66052 




102550 


U58087 


Hs.14541 


108417 


AA075716 




113299 


T67285 


Hs.13089 


117869 


N49947 


Hs.46990 


113734 


T98484 


Hs.18377 


133325 


C00424 


Hs.7101 


123368 


AA505022 


Hs.124838 


101615 


M55153 


Hs.8265 


119352 


T65972 


Hs.193365 


123828 


AA620686 


Hs.112884


103611 


Z38133 


Hs.113973


131289 


AA485697 


Hs.25334 


128678 


T15896 


Hs.103535 


130814 


AA256695 


Hs.19813 


133391 


X57579 


Hs.727 


129322 


AA437153 


Hs.110407


109284 


M196995 


Hs.86092 


116689 


F09222 


Hs.66099 


100545 


HG2147-HT2217 




102634 


U66711 


Hs.77667 


111735 


R25389 


Hs^3856 


105181 


AA190676 


Hs.10974 


122681 


AA455350 


Hs.99401 


114543 


AA056121 


Hs.158419 


133597 


AA425908 


Hs.75139 


121064 


AA398647 


Hs.97406 


122231 


AA436369 


Hs.197728 


100309 


D50550 


Hs.95659 


101727 


M73481 


Hs.73883 


131226 


AA165400 


H&24476 


133580 


AA095041 


Hs.181073 


102792 


U87964 


HJL227576 


104976 


AA086480 


Hs.183669 


120865 


AA350631


Hs.96963 


106080 


AA416046 


Hs.35124 


128571 


AA416619 


Hs.101661 


101838 


M92934 


Hs.75511 


128514 


H84261 


Hs.100843 


123099 


AA465931 


Hs.79 


134067 


Y08200 


Hs.78920 


116967 


H80336 


Hs.40124 


110053 


H12586 


Hs.89563 


114395 


AA007313 


Hs.110155 


107465 


W44681 


Hs.251385 


101983 


S85655 


Hs.75323 


112544 


R70948 


Hs.29153 


111423 


R01165 


Hs.188507


127918 


AA806043 


Hs.115396


107300 


T40348 


Hs.90488 


134947 


R51194 




124579 


N68345 


Hs.127179 


130471 


Z68280 


Hs.183706 


116596 


D60755 


Hs.92955 


105069 


AA136345 


Hs23617 


102491 


U51010 




130069 


AA055896 


Hs.146428 


130234 


AA280413 


Hs.157441 


120540 


AA262992 


Hs.96417 


122508 


AA449221 


HS20432 



ESTs 0.084 

Human clone W2-6 mRNA from chromosome X 0.084 

cullin 1 0.084
zm89e5.s1 Stratagene ovarian cancer (#937219) H sapiens cDNA clone
IMAGE:54512 3' similar to gb:X14723 CLUSTERIN PRECURSOR
(HUMAN);, mRNA sequence. 0.084

ESTs 0.084 

ESTs 0.084 

EST 0.084 

periodontal ligament fibroblast protein 0.084 

ESTs 0.084 
transglutaminase 2 (C polypeptide; protein-glutamine
gamma-glutamyltransferase) 0.084
ESTs; Moderately similar to alternatively spliced product 

using exon 13A [H.sapiens] 0.084 

EST 0.084 

myosin; heavy polypeptide 8; skeletal muscle; perinatal 0.084 
ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC 

PRECURSOR [M.musculus] 0.084

ESTs 0.084 

ESTs 0.084 

inhibin; beta A (activin A; activin AB alpha polypeptide) 0.084 
ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans] 0.084

ESTs 0.084 

ESTs 0.085 

Mucin 3, Intestinal (Gb:M55405) 0.085 

lymphocyte antigen 6 complex; locus E 0.085 

ESTs; Weakly similar to FAST kinase [H.sapiens] 0.085

ESTs; Moderately similar to unknown [R.norvegicus] 0.085 

EST 0.085 

ESTs 0.085 

partner of RAC1 (arfaptin 2) 0.085

ESTs 0.085

ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens] 0.085

lethal giant larvae (Drosophila) homolog 1 0.085 

gastrin-releasing peptide receptor 0.085 

ESTs 0.085 

ESTs 0.085 

GTP binding protein 1 0.085 
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.085

EST 0.085 

ESTs 0.085 

ESTs 0.085 

connective tissue growth factor 0.085 

ESTs; Weakly similar to similar to GTP-binding protein [C.elegans] 0.085

aminoacylase 1 0.085

Rab geranylgeranyltransferase; alpha subunit 0.085

EST 0.085 

nuclear cap binding protein 1; 80kD 0.085 

ESTs 0.085 

murine retrovirus integration site 1 homolog 0.085

prohibitin 0.085

ESTs 0.086

ESTs 0.086

Human germline IgD chain gene; C-region; C-delta-1 domain 0.086

ESTs 0.086
yj71a08.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154166
5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN
KINASE KINASE 1 (HUMAN);, mRNA sequence. 0.086
ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH
FACTOR 1 [H.sapiens] 0.086

adducin 1 (alpha) 0.086

ESTs 0.086

ESTs; Weakly similar to ZFOC1 gene product [H.sapiens] 0.086

Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region 0.086

collagen; type V; alpha 1 0.086

spleen focus forming virus (SFFV) proviral integration oncogene spi1 0.086

ESTs 0.086 

ESTs 0.086 
128054
133020
130056
130504
133978
105265
133035
100768
129338
132789
116099
100721
112569
130645
100751
134550
130885
101446
116287
134034
130860
109901
107537



121288
108844
129874
105139
124789
115923
123640
131607
130064
108752
124249
100109
104642
131752
114727
120965
100396
106218
111562
121219
101187
101513
116454
116171
117500
119978
132005
109914
130370
104262
129708
106398
120884
130404
114072
131470
124573
114717
133806
130470
133182



AI205718 

AA053248 

AA017356 

U48865 

W73859 

AA227941 

T15965 

HG3636-HT3846 

T56800 

W23761 

AA456309 

HG3355-HT3532 

R73150 

AA020942 

HG3527-HT3721 

M27161 

AA338646 

M21302 

AA487856 

X89267 

U66061 

H04992 

Z20777 

AA496030 
AA085161 

AA401735 

M132916 

AA406488 

AA164543 

R43803 

AA441929 

AA609292 

AA351409 

T67053 

AA127070 

H68077 

AJ000480 

AA004662 

AA453311 

AA132545 

AA398089 

D84361 

AA428451 

R09567 

AA400606 

L20316 

M28210 

AA621071 

AA463434 

N31909 

W88623 

D58231 

H05529 

M55265 

AF009801 

AA417181 

AA447545 

AA365356 

X72012 

Z38184 

X54938 

N67935 

AA131240 

M12759 

AA398552 

Z80787 

AA452572 



Hs.125416 

Hs.185182 

Hs.171900 

Hs.158323 

Hs.78061 

Hs.26088 

Hs.6333 

Hs.47274 
Hs.56876 
Hs.58831 

Hs.75270 
Hs.17200 

Hs.85258 

Hs.20912 

Hs.56306 

Hs.155829 

Hs.78601

Hs.241395

Hs.30499

Hs.9857

Hs.6845



Hs.97340

Hs.177961

Hs.181551

Hs.110082

Hs.78110

Hs.38205

Hs.112681

Hs.172740 

Hs.181125 

Hs.71055 

Hs.108211 

Hs.143513 

Hs.184245 

Hs.31566 

Hs.190202

Hs.179715 

Hs.151123 

H&91146 

Hs.187569 

Hs.144344 

Hs.208

Hs.27744 

Hs.42034 

Hs.42658 

Hs.44278 

Hs.59190 

Hs.173091 

Hs.194704 

Hs.155140 

Hs.105941 

Hs.120858

Hs.18268 

Hs.97041 

Hs.76753 

Hs.123633 

Hs.2722 

Hs.194703 

Hs.252014 

Hs.76325 

Hs.15711 

Hs.240135 

Hs.43866 



ESTs 0.086 

ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens] 0.086

armadillo repeat gene deletes in velocardiofacial syndrome 0.086

CCAAT/enhancer binding protein (C/EBP); epsilon 0.086 

transcription factor 21 0.086 

ESTs 0.086 

ESTs 0.086

Myosin, Heavy Polypeptide 9, Non-Muscle 0.086 

Homo sapiens mRNA; cDNA DKFZp564B176 (from clone DKFZp564B176) 0.086

ESTs 0.086 

regulator of Fas-induced apoptosis 0.086 

Peroxisome Proliferator Activated Receptor (Gb230972) 0.087 

GTP-binding protein homologous to Saccharomyces cerevisiae SEC4 0.087 

STAM-like protein containing SH3 and ITAM domains 2 0.087

Luteinizing Hormone, Beta Subunit 0.087

CD8 antigen; alpha polypeptide (p32) 0.087

adenomatous polyposis coli like 0.087

small proline-rich protein 2A 0.087

KIAA0676 protein 0.087 

uroporphyrinogen decarboxylase 0.087 

protease; serine; 1 (trypsin 1) 0.087 

ESTs 0.087 
ESTs; Weakly similar to peroxisomal short-chain alcohol
dehydrogenase [H.sapiens] 0.087

ESTs 0.087
zn12c5.s1 Stratagene hNT neuron (#937233) H sapiens cDNA clone
IMAGE:54728 3' similar to TR:G1151228 G1151228 LPG1P.;, mRNA seq 0.087

EST 0.087

Human Chromosome 16 BAC clone CIT987SK-A-388D4 0.087

ESTs 0.087 

ESTs 0.088 

ESTs; Weakly similar to F17A92 [C.elegans] 0.088 

ESTs 0.088 

ESTs 0.088 

microtubule-associated protein; RP/EB family; member 3 0.088

immunoglobulin lambda gene cluster 0.088

ESTs 0.088 

ESTs 0.088 

phosphoprotein regulated by myogenic pathways 0.088 

KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 0.088

ESTs 0.088 

ESTs 0.088 

ESTs 0.088 

Human mRNA for p52 and p64 isoforms of N-Shc; complete cds 0.088 

DKFZP586E0820 protein 0.088 

ESTs 0.088 

EST 0.088 

glucagon receptor 0.088 

RAB3A; member RAS oncogene family 0.088 

ESTs; Moderately similar to T-complex protein 10A [H.sapiens] 0.088

ESTs 0.089 

ESTs 0.089 

EST 0.089

DKFZP434K151 protein 0.089

leucine-rich; glioma inactivated 1 0.089

casein kinase 2; alpha 1 polypeptide 0.089 

bagpipe homeobox (Drosophila) homolog 1 0.089 

ESTs 0.089 

adenylate kinase 5 0.089 

ESTs 0.089 

endoglin (Osler-Rendu-Weber syndrome 1) 0.089

ESTs 0.089

inositol 1;4;5-trisphosphate 3-kinase A 0.089

adaptor-related protein complex 4; mu 1 subunit 0.089 

EST 0.089 

Human Ig J chain gene 0.09 

KIAA0639 protein 0.09 

H4 histone family; member J 0.09

ESTs 0.09 






132404 
122695 
125975 
110783 
129860 
120740 
119564 
134474 
119014 
109791 
117605 
121589 
104326 
129861 
102795 
119626 
110516 
105382 
123754 
108008 
121057 
123675 
135194 
127070 
134051 
133382 
103615 
118457 
118504 
112915 



101504 
112550 
128551 
112879 
127079 
101993 
113020 
120465 
130152 
104941 
110090 
135375 
123799 
118966 
116969 
125147 
100836 
114726 
107311 
112863 
129290 
103384 

112508 
111863 
131184 
107420 
111768 
112290 
130581 
120744 
112226 
116154 
102640 
129797 
102705 
132408 
108441 



AA393903 

AA456048 

AA495891 

N23669 

AA410343 

AA302650 

W38206 

AA054746 

N95435 

F10669 

N35073 

AM16627 

D81655 

N69507 

U88667 

W49499 

H56894 

AA236853 

AA609964 

AA039430 

AA398619 

AA609474 

C20975 

AA641812 

S67070 

AA112532

Z46967 



N67334 

T10176 

AM70121 

M27288 

R71391 

H09058 

T03541 

AI364691 

U01062 

T23830 

AA251505 

U32645 

AA065169 

H16076 

AA480888 

AA620418 

N93438 

H80833 

W38150 

HG4113-HT4383 

AA132509 

T57738 

T03148 

AA521407 

X92762 

R68213 

R37495 

AA452705 

W26567 

R27606 

R53940 

AA481982 

AA302772 

R50761 

AA460951 

U67674 

X53595 

U77180 

AA035547 

AA079079 



Hs.4768 

Hs.99403 

Hs.152290 

Hs^6407 

Hs.129826 

Hs.96654 

Hs.8379 

Hs.55144 

Hs.13228 

Hs.44433 

Hs.191598 

Hs.143067 

Hs.129849 

Hs.198396 

Hs.184456 

Hs.37368 

Hs.111801 

Hs.102021 

Hs.61920 

Hs.142375 

Hs.112713

Hs.9613 

Hs.190037 

Hs.78846 

Hs.7247 

Hs.115460 

Hs.49230 

H&50158 

Hs.4254 

Hs.243960 

Hs.248156 

Hs.29074 

Hs.237323 

Hs.115960

Hs.128628

Hs.77515 

Hs.7303 

Hs.130861 

Hs.151139 

H&178G5 

Hs.6915 

Hs.99741 

Hs.112861 

Hs.76907 

Hs.143038 



Hs.103827 

Hs.174112 

Hs.4610 

Hs.110095 

Hs.79021 

H^28847 

Hs.23578 

HS23954 

Hs.4775 

Hs54185 

Hs£6016 

Hs.16258 

Hs.228649 

Hs.25738 

HS57100 

Hs.194783 

Hs.1252 

Hs.50002 

Hs.47822 



ESTs 0.09 

ESTs; Moderately similar to undulin 2 [H.sapiens] 0.09

ESTs; Highly similar to PACAP type-3/VIP type-2 receptor [H.sapiens] 0.09 

ESTs 0.09 

tetraspan transmembrane 4 super family 0.09 

EST 0.09 

Accession not listed in Genbank 0.09 

ESTs 0.09 

ESTs 0.09 

DRE-antagonist modulator; calsenilin 0.09 

ESTs 0.09 

ESTs 0.09 

ESTs 0.09 

DKFZP564M182 protein 0.09 

ATP-binding cassette; sub-family A (ABC1); member 4 0.09

ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.09

EST 0.09 

Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023) 0.09 

ESTs 0.09 

ESTs 0.09 

ESTs; Moderately similar to putative envelope protein [H.sapiens] 0.091

EST 0.091

ESTs; Highly similar to angiopoietin-related protein [H.sapiens] 0.091

ESTs 0.091 

heat shock 27kD protein 2 0.091 

ESTs 0.091 

calicin 0.091 

EST 0.091 

ESTs 0.091 

ESTs 0.091 

HLA-B associated transcript-3 0.091 

oncostatin M 0.091 

ESTs 0.091 

N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 0.091

ESTs 0.091 

ESTs; Moderately similar to CL3BC [R.norvegicus] 0.091

inositol 1;4;5-triphosphate receptor; type 3 0.091

ESTs; Weakly similar to PROHIBITIN [H.sapiens] 0.091

ESTs 0.091 

E74-like factor 4 (ets domain transcription factor) 0.091

ESTs 0.091 

ESTs 0.091 

ESTs; Weakly similar to BRAIN PROTEIN H5 [H.sapiens] 0.091

ESTs 0.092 

ESTs; Highly similar to HSPC002 [H.sapiens] 0.092

ESTs 0.092 

Accession not listed in Genbank 0.092 

Olfactory Receptor Or17-201 0.092 

EST 0.092 

ESTs 0.092 

EST 0.092 

ESTs 0.092 
tafazzin (cardiomyopathy; dilated 3A (X-linked); endocardial fibroelastosis 2; Barth syndrome) 0.092

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to KIAA0584 protein [H.sapiens] 0.092

ESTs 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H.sapiens] 0.092

EST 0.093 

ESTs 0.093 

ESTs 0.093 

solute carrier family 10 (sodium/bile acid cotransporter family); member 2 0.093

apolipoprotein H (beta-2-glycoprotein I) 0.093

small inducible cytokine subfamily A (Cys-Cys); member 19 0.093

KIAA0380 gene product; RhoA-specific guanine nucleotide exchange factor 0.093
zm97c9.s1 Stratagene colon HT29 (#937221) Homo sapiens cDNA clone
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108145 
106466 
101697 
121294 

117824 
115771 
102303 
131405 
112909 
124173 
112488 
130554 
106413 
111711 
117595 
113813 
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114966 

130297 
109589 
112592 
102314 
116128 
106809 



IMAGE545872 3' similar to contains element MER22 MER22 repetitive 

element; mRNA sequence 0.093

AA054133 Hs.63085 ESTs 0.093 

AA449990 Hs.76057 lysophospholipase II 0.093

M64358 Human mom-3 gene, exon 0.093 
AA401958 Hs.240170 ESTs; Moderately similar to alternatively spliced product using exon 13A [H.sapiens] 0.093

N49065 Hs.125201 ESTs; Weakly similar to B7 [M.musculus] 0.093 

AA422049 Hs.40780 ESTs 0.093 

U33053 Hs.2499 protein kinase C-like 1 0.093

U79255 Hs.26468 amyloid beta (A4) precursor protein-binding; family A; member 2 (X11-like) 0.093

T10069 Hs.101094 ESTs 0.093 

H41281 Hs.107619 ESTs 0.093 

R66896 H&28788 ESTs 0.093 

X59303 Hs.159637 valyl-tRNA synthetase 2 0.093

AA447964 Hs.6311 ESTs 0.093 

R22891 Hs.7093 ESTs 0.094 

N34933 Hs.44664 EST 0.094 

W45174 Hs.31382 ESTs 0.094 
AA018449 Hs.125220 Homo sapiens DNA from chromosome 19-cosmids R30102:R29350:R27740 containing MEF2B; genomic sequence 0.094

AA250743 Hs.92198 ESTs; Highly similar to calcium-regulated heat stable protein CRHSP-24 [H.sapiens] 0.094

H94949 Hs.171955 trophinin-assisting protein (tastin) 0.094

F02429 Hs.6581 ESTs 0.094

R77631 Hs^9126 ESTs 0.094 

U34038 Hs.154299 coagulation factor II (thrombin) receptor-like 1 0.094 

AA459915 Hs.112193 mutS (E. coli) homolog 5 0.094

AA479704 Hs.220324 Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33. Contains the alternatively spliced gene for Matrix Metalloproteinase in the Female Reproductive tract MIFR1; -2; MMP21/22A; -B and -C; a novel gene; the alternatively spliced CDC2L2 gene for 0.094


130607 AA043894 Hs.16603 ESTs 0.094

120592 AA281929 Hs.143974 ESTs 0.094

117230 N20535 Hs.43265 melastatin 1 0.094

105948 AA404597 Hs.7133 ESTs 0.094

101333 L47738 Hs.80313 p53 inducible protein 0.094

101909 S69265 Homo sapiens mRNA for PLE21 protein; complete cds 0.094

106959 AA497031 Hs.8657 ESTs; Highly similar to CTG7a [H.sapiens] 0.094

127034 AA352389 ESTs; Wkly smlr to glucose-6-phosphatase catalytic subunit [R.norvegicus] 0.095

134430 H52105 Hs.8309 KIAA0747 protein 0.095

120342 AA207105 Hs.45068 Homo sapiens mRNA; cDNA DKFZp434I143 (from clone DKFZp434I143) 0.095

104450 L77564 Hs.103978 serine/threonine kinase 22B (spermiogenesis associated) 0.095

130902 AA424530 H&21061 ESTs 0.095

102708 U77594 Hs.37682 retinoic acid receptor responder (tazarotene induced) 2 0.095

107373 U85773 Hs.154695 phosphomannomutase 2 0.095

123569 AA608952 Hs.195292 ESTs; Weakly similar to RNA helicase HDB/DICE1 [H.sapiens] 0.095

102687 U73379 Hs.93002 ubiquitin carrier protein E2-C 0.095

128888 AA034951 Hs.106893 ESTs 0.095

100283 D43642 Hs.2430 transcription factor-like 1 0.095

102747 U79303 Hs.82482 protein predicted by clone 23882 0.095

107798 AA019346 Hs.60918 EST 0.095

123565 AA608907 Hs.112614 EST 0.095

116010 AA449450 Hs.56421 ESTs; Weakly similar to Similarity to H.influenza ribonuclease PH [C.elegans] 0.095

117155 H97536 Hs.42391 EST 0.095

133094 AA115572 Hs.64746 chloride intracellular channel 3 0.095

113174 T54659 Hs.9779 ESTs 0.095

102016 U03270 Hs.122511 centrin; EF-hand protein; 1 0.095

130126 AB002318 Hs.150443 KIAA0320 protein 0.095

134813 X14767 Hs.89768 gamma-aminobutyric acid (GABA) A receptor, beta 1 0.095

132055 N69440 Hs.38132 ESTs 0.095

122229 AA436198 Hs.103902 ESTs 0.096

127574 AA907314 Hs.188905 ESTs 0.096

134432 AA053022 Hs.8312 ESTs 0.096

128052 AA878398 Hs.190491 ESTs 0.096

101637 M58285 Hs.132834 hematopoietic protein 1 0.096

103386 X92972 Hs.80324 protein phosphatase 6; catalytic subunit 0.096

133079 AA477561 Hs.6449 ESTs 0.096

120328 AA196979 Hs.104129 ESTs; Weakly similar to protease [H.sapiens] 0.096
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107640 


AAAACC4C 

AAU09815 


Uc 0C7QAQ 


fcolS 




123389 


AA521176 


m. oo< on 


COT* 

ESTS 


u.uyo 


103222 


X74795 


HS.77i7i 


minicnromosome maintenance oeiicieni (o. cerevistae) o [cqu omskmi cyae 4o) 




111704 


R22450 


Hs.23396 


ESTs; Highly similar to ZINC FINGER protein 140 [H.saplensj 


A AOC 

u.uyo 


126656 


AA306523 




EST177475 Juncat T-ceiis VI nomo sapiens cuna o end, mHNA sequence. 


A "70Q 


127071 


AA250806 




ESTs 


A AAC 


114550 AA056755 Hs.151714 ESTs 0.096


125955 


A1356943 


Hs.1 43761 


ESTs 


A AOC 

u.uyo 


134363 M37033 Hs.82212 CD53 antigen 0.096


128550 


W76492 


Hs.1 70142 


ESTs 


A AAA 

u.uyo 


122598 


AA453465 


Hs.99329 


ESTs 


A AOC 

0.036 


118898 N90703 Hs.4236 KIAA0478 gene product 0.096

117661 N39092 Hs.44940 ESTs 0.096

120996 AA398281 Hs.143684 ESTs 0.096

123388 AA521172 Hs.134417 ESTs 0.096

106700 AA463929 Hs.28701 ESTs 0.096

112962 T16814 Hs.6828 ESTs 0.096

121262 AA401372 Hs.97723 ESTs 0.096

134551 R44839 Hs.8526 hbeta-1;3-N-acetylglucosaminyltransferase 0.096

112060 R43754 Hs.21164 ESTs 0.096

134678 AA039935 Hs.182595 dynein; axonemal; light polypeptide 4 0.096

100855 HG4234-HT4504 Methylenetetrahydrofolate Reductase 0.097

132414 N91193 Hs.48145 ESTs 0.097

112900 T08758 Hs.3813 ESTs 0.097

115989 AA447777 Hs.93135 ESTs 0.097

103561 Z21488 Hs.143434 contactin 1 0.097

131087 AA009738 Hs.22824 ESTs; Weakly similar to p160 myb-binding protein [M.musculus] 0.097

120293 AA190859 Hs.191428 ESTs 0.097

111830 R36081 Hs.25085 EST 0.097

113654 T95770 Hs.17666 ESTs 0.097

132675 AA179338 Hs.5476 serine proteinase inhibitor 0.097

120182 Z40125 Hs.91968 ESTs 0.097

132879 U16282 Hs.5881 ELL gene (11-19 lysine-rich leukemia gene) 0.097

134211 AA056681 Hs.80021 ESTs; Weakly similar to 62D9,p [D.melanogaster] 0.097

115448 AA284845 Hs.165051 ESTs 0.097

118118 N56901 Hs.47995 ESTs 0.097

107598 AA004528 Hs.169444 ESTs 0.097

128933 H01824 HsJ60 GATA-binding protein 2 0.097

114892 AA235988 Hs.86024 ESTs 0.097

101922 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.097

105444 AA252374 Hs.19333 ESTs; Weakly similar to ATP(GTP)-binding protein [H.sapiens] 0.097

126155 AA926843 Hs.143302 ESTs 0.097

116276 AA485870 Hs.44914 ESTs 0.097

111964 R41227 Hs^1860 ESTs 0.097

135100 AA398926 Hs£51108 Homo sapiens mRNA; chromosome 1 specific transcript KIAA0493 0.097

124872 R69251 Hs.101506 EST 0.097

103084 X59932 Hs.77793 c-src tyrosine kinase 0.097

124138 H23199 Hs.107010 ESTs 0.098

130048 R31745 Hs.211612 SEC24 (S. cerevisiae) related gene family; member A 0.098


100208 


D26129 


Hs.78224 


ribonuclease; RNase A family; 1 (pancreatic) 


A ACQ 


123537 


AA608775 


Hs.1 12589 


ESTs 


A AAQ 

0.098 


118999 N95019 Hs.55092 ESTs 0.098


119847 


W80384 


Hs.9853 


ESTs 


A AQQ 

u.uyo 


112819 


R98618 


HS.35984 


ESTs 


A ACQ 

u.uyo 


131080 J05008 Hs.2271 endothelin 1 0.098


127353 


Ml 80853 


U„ 4 croon 

nS.155ooU 


cot*. 
cols 


n ACQ 

u.uyo 


132068 X66365 Hs.38481 cyclin-dependent kinase 6 0.098


i(V:7 A A 


AA9Q343A 


H<; 12909 


ESTs 


0.098 


133680 M92357 Hs.101382 tumor necrosis factor; alpha-induced protein 2 0.098

122899 AA469960 Hs.178420 ESTs; Highly similar to WASP interacting protein [H.sapiens] 0.098

128700 U59286 Hs.103982 small inducible cytokine subfamily B (Cys-X-Cys); member 11 0.098

104393 H46486 Hs.226499 nesca protein 0.098

123320 AA496792 Hs.139572 EST 0.098

129169 N31641 Hs.109058 ribosomal protein S6 kinase; 90kD; polypeptide 5 0.098

135093 U51333 Hs.159237 hexokinase 3 (white cell) 0.098

113269 T65159 Hs.85044 ESTs 0.098

124283 H86783 Hs.194136 ESTs; Moderately similar to zinc finger protein RIN ZF [R.norvegicus] 0.098

114376 GMCSF Accession not listed in Genbank 0.099

100881 HG4458-HT4727 Immunoglobulin Heavy Chain, Vdjc Regions (Gb:L23563) 0.099
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116572 


D45654 


Hs.65582 


OKFZP586C1324 protein 


0.099 


400ACC 




Lie 110Q/I7 

nS.ll £04/ 


COI 


0.099 


4AAO-1Q 

lOUolo 


nvj4Ulo-n l4<:oo 




f^r\\iwLRmA\nn Poll Arlhncinn Mntaraffa 
uptOKruiriQUiy voii mujiuoium i*iujouu«> 


0.099 


132754 


W47419 




n urn an una ircm enromosorne iy-j^wiicw»iniu rcotjoo, genomic seijuBnce 


nnoq 


112741 


R93080 


Hs.35035 


cSTS 


0099 


112748 


R93299 


Hs.1 66492 


rpT- 

fcols 




130858 


S57235 


II. ninnni 

Hs.246381 


CD68 antigen 




124870 


R69233 


Hs.101504 


POT/. 

ESTS 


n noo 
u.uyy 


125304 


Z39833 


Hs.124940 


/*Tn L!_JJ__ nrn l n | n 

GTP-Dindlng protein 


u.uyy 


121297 


AA401995 


Hs.97860 


cSTS 


u.uyy 


128602 


AA046103 


Hs.1 02367 


ESTs 


n noo 
u.uyy 


124062 


H 00440 


Hs.144524 


ESTs; WeaWy similar to signal transducer and activator of 
transcription 2 [M.muscutus] 


u.uyy 


100547 


HG2149-HT2219 




Mucin (Gb:M57417) 


A OAA 

u.uyy 


105652 


AA282505 


Hs.19015 


ESTs 


A nnrt 

u.uyy 


133390 


AA459945 


Hs.72660 


KIAA0585 protein 


u.uyy 


133503 


M33195 


Hs.743 


Fc fragment of tgt; ntgn affimty l, receptor tor; gamma polypeptide 


n AQQ 

u.uyy 


109461 


AA232667 


Hs.58210 


ESTs 


a nfifl 

u.uyy 


102068 


U09117 


Hs.80776 


phosphotipase C; delta 1 


u.uyy 


113464 


T86931 


Hs.16295 


ESTs 


A AOG 

u.uyy 


104240 


AB0Q2368 


Hs.70500 


KIAA0370 protein 


A AGO 

u.uyy 


121113 


AA399109 


Hs.1 61813 


ESTs 


A 4 
U.l 


122896 


AA469952 


Hs.97899 


ESTs; Weakly simitar to dal2; len:343, CAI. Q.17; ALGJrcAST P25335 
ALLANTOICASE [Sxerevisiae] 


A A 
U.I 


102405 U43148 Hs.159526 patched (Drosophila) homolog 0.1

103599 Z33905 Hs.81218 receptor-associated protein of the synapse; 43kD 0.1

121079 AA398719 Hs.14169 ESTs; Weakly similar to CREB-binding protein [H.sapiens] 0.1

115820 AA427487 H&39619 ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens] 0.781

125106 T95766 Hs.189760 ESTs 0.1

131373 N68116 Hei26146 Down syndrome critical region gene 3 0.1

120224 Z41239 Hs.106960 ESTs 0.1

133090 AA448228 Hs.6468 ESTs 0.1

132300 AA133244 Hs.44234 ESTs 0.1

113129 T49384 Hs.8988 EST 0.1

110638 H73197 Hs.17241 ESTs 0.1

131364 R53255 H&26010 ESTs 0.1

105370 AA236476 Hs22791 ESTs; Weakly similar to transmembrane protein with EGF-like and two follistatin-like domains 1 [H.sapiens] 0238
TABLE 11A shows the accession numbers for those primekeys lacking UnigeneIDs for
Table 11. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 
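The three columns defined above can be read as a lookup from probeset to cluster to member sequences. The following is a minimal sketch in Python, assuming each row has already been cleaned so that Pkey, CAT number and accessions are separated by whitespace; the names `ClusterRow`, `parse_cluster_row` and `index_by_pkey` are hypothetical and are not part of the original disclosure.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One Table 11A row: probeset key, gene cluster number, member accessions."""
    pkey: str
    cat_number: str
    accessions: List[str]


def parse_cluster_row(line: str) -> ClusterRow:
    # Assumption: fields are whitespace-separated, Pkey first, CAT number second.
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(fields[0], fields[1], fields[2:])


def index_by_pkey(lines: Iterable[str]) -> Dict[str, ClusterRow]:
    # Blank separator lines between rows are skipped.
    index: Dict[str, ClusterRow] = {}
    for line in lines:
        if line.strip():
            row = parse_cluster_row(line)
            index[row.pkey] = row
    return index


# Two rows taken from Table 11A, with spacing normalized for illustration.
sample_rows = [
    "100768 tigr_HT3846 L29141 M69180 M81105",
    "",
    "108441 genbank_AA079079 AA079079",
]
index = index_by_pkey(sample_rows)
print(index["100768"].accessions)
```

Rows whose fields were fused or split during text extraction would need to be repaired before they can be parsed this way.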



Pkey CAT number Accession 

100610 19864 1 AW161357 A1879062 AI928938 AW161097 AW161 167 BE314465 AA351715 F0709fi AA179034 F08510 F00653 AI936671 

AA476718 AW772454 AI807703 R44253 AA976667 AI985186 AI650254 H38942 R84829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355 AW950556 D51397 AA213981 BE548002 AI056359 AA001560 AW9521 13 
AA317769 AI857477 AI857475 AW249771 AW162661 H38943 AA018628 R85885 AI984613 AI934765 AI796172 AW157488 
AI929191 R85523 D51221 D53851 H85610 AI749674 F21582 AA323145 AA019127 AA687444 T06745 AI699293 H29532 
AA214029 AA223656 NM 016834 X14474 R19697 H09695 R17455 R13812 R19056 A1681231 AI590200 R37671 AA861828 
AI990023 AI935669 AW005821 AA324581 H17335 R37659 R42802 R46242 R60936 R59731 H28993 AM79907 R44570 
AI890696 AA308884 AA507078 R41274 AI365507T16348 AI560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

100674 21517 2 AW403342 AW248986 BE561709 AA357312 BE31 1834 BE389496 BE294887 AW732696 BE047868 AI702383 BE019155 

AI702367 BE408966 BE280458 BE313759 BE513492 BE535404 BE280258 AC005263 NM_007165 L21990 AW732711 
AI564920 AW249094 BE265365 AW607186 AW607346 BE005217 H2721 1 U46230 BE260066 BE207Q43 BE546782 
AW248659 

108559 41469 9 AA085228 AA085161

100721 19818J L40904 NM_005037 X90563 AB005526 H21596 AA088517

100748 41861J X06096 X05826 

100750 15759J BE157260 BE157265 R481 18 H43827 Z17877 AW379070 AW291778 M20605 J03253 M14206 V0O568 AI860465 AW296022 

M13930 AL047400 J00120 BE018476 AW675223 T26980 F06694 R22709 R24720 H22753 A1903100 AI903094 AW937823 
X00364 D10493 K01904 K01906 K00535 LO0058 AA410662 AW384760 AA304930 AI680985 X00198 H58025 AW998901 
AV653447 N31654 AW610357 AW610369 AW862480 BE223010 AW384172 AW384219 AW384171 AW384218 AA298522 
BE140421 AW94S162 AW75171 1 AA514409 AW747912 AI214214 W87741 AA972406 AA554513 BE302O87 AI249030 
AA477850 AV653129 AE81360 AG74110 W87861 AA641366 X66258 AI051600 AA877139 AA527483 AA857219 AI250782 
AA625531 AA807892 AJ278811 A1224033 H24033 AA593396 AW129709 R45453 N22772 AA235530 T29737 AI016409 
AI688907 AA568370 AA722760 A639329 AA550843 AW674698 AI538452 A1538453 AI337957 AA477744 AA464600 
AI140319 AW949294AI339781 A1828736 AA923634AA344094 AI278350 AA975567 AA908416 AA857170AW023520 
R43413 R48004 F02958 A1989439 R1 1207 AA737307 D10493 AW950652 AI093842 A1474024 AA703369 R11264 M13930 
M13930 M13930 M13930 M13930 J00120 M13930 M13930 X00364 J00120 R19507 AA639812 

100751 24700J N32759 N29730 N3G831 N32604 N31955 AI206390 H87574 R23494 AI186215 N30036 AI741512 J00117 NM__000737 

AM53626 AA330974 AI188729 AI188604 AI188964 N30276 AI188947 A1188830 AI188303 AI200457 AI219166 AI192459 
A1183280 AI189275 A1188639 AI186353 AI189616 AI184224 AI130720 A1188454 A1188391 AI148857 AI192447 AI209155 
AI190013 A1206355 AI188721 AI189429 AI189364 AI186330 AI431595 AI189595 AI188781 AI148647AI200022 AI221552 
AI220923 AI188728 AA233034 AI1 89807 AI189641 AI219044 AH 48774 AI200658 W71 989 AI207360 A11 88824 A1200559 
AI20O270 AA644163 All 99943 AI151301 AI189555 AI262724 AI148590 AI148695 AI126906 AI149163 K03183 K03189 
AI189842 AI221014 N30608 AI186465 AI220865 A1188498 A1138226 AI189968 AI221019 AI138197 AI149426 AI148904 
AI186218 AI188348 A! 160579 A1198460 AI149039 AI160936 AI21 9055 A1184784 AI221580 AJ161082 A1160814 AI123896 
A1417614 AI126101 AI1 88872 AI149571 A11 68533 AI14S072 A1149467 AI131286 N30684AI1 60705 A1160692 AI149559 
AI273580 AI189442 AI138448 AI149591 N27302 AA400910 AI138431 AI138435 AI128407 N30216 AI128296 A1219589 
AI188492 AI149447 AI168482 H95374 AI219009 N31616 AI276216 N32233 AI291937 N30741 AI188689 N27111 R23214 
AI221605 AI184348 AI200375 H94451 N26397AI871881 AA232905 N30833 AI220780 H94446 N30822 H87464 R68815 
N30290 AI128424 H12587 T47334 H87631 H87156 AI219133 AI868741 AA330859 H86993 AA330413 H93656 N30817 
T90191 H93668 A1200054 H95207 T47316 H95381 T49170 R00880T49171 N27381 H94107 R63352 T85053 AW451899 
H95142 N30313H94015 H86987 T28278 N29701 C18834 AA331267 AA330939 A1654493 N27073 N29831 R68113N30758 
R26086 N32108 H95135 AA330414 AA330978 AI219422 AI189453 Al 199951 X00264 NMJJ00894 AA371909 AA063496 
T29543 AA371971 AA372026 AA371978 AA371346 AI051683 A1186418 AI220659 AI189068 AI219266 AI186552 AI188715 
AI149156 

100760 1334 7 AW794626 M27126 M27014 

100775 18179 3 J05581 M61170T27692 M34088 M34089 AW860335 AW579047AW610437AW610386AW610422 AW610473AW579078 

AW604897 AW860163 AW579067 AW862410 AI816584 AW177757 AW602769 AI909790 AW860331 AI909787 AI909811 
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100800 24735J 



100818 19604.3 

100881 458.127 

100885 12707.3 
100898 8542J 



102459 3556J 
126126 1630017J 
102620 16821.37 
102673 24986 6 
102675 5145.4 
102753 2226 1 
102799 34624.4 
127034 51148.2 

103522 21640.1 



127071 188097 J 

126456 291965J

119388 1762256 J 

126856 20669.1 



103996 224545.1



113213 23798.1 



134947 844579.1 
129311 16078.1 



AI909813 AW845083 AI905920 AW387919 BE140766 AI909279 AW369405 AA429321 AA429320 AA367451 AA847972 
AW001137 AI567905 T84561 A1631295 AA151351 H02932 AI884519 AA367457 AW369421 AI678846 AW391803 AI610869 
AW1 92838 AI922289 AI952140 A191 0233 AI479474 AW001 395 AA488073 AI985760 AW1 30017 A1858369 AA627845 
AW081805 AA158865 A1624443 AA344985 AA569793 R72486 AI589329 AI903204 AI269893 AA641284 AI279932 AA149270 
AI697120 AA729146 AI589353 AA480067 AI923310 AA530908 A1275395 AA425062 AA58Q280 AA889527 AA158866 
AW131341 AA573028 AA877326 T29335 AW951288 H04235 AA099243 AA994659 AI659618 AA887919 AI299297 
AW001 1 16 AW263844 AI270578 AA970828 AW572126 AA775299 AW369449 AW369398 AW369452 AI933677 AI870710 
AI09291 1 A682464 AI497674 AA937026 AA885865 L38597 AA908325 AW369432 AWQ26623 AA627778 AI264942 
AA932409 AI187328 AI672970 AI886098 AW440471 AW138860 AI866658 AI802528 AI926172 AW243914 AI933690 
AA9961 14 AA536189 AW009937 AI918060 AI270379 AI973169 AW175638 AW369413 

NM_006227 L26232 R50649 AU077024 AL008726 AM11079 R35151 BE278153 BE278139 A1459777 RB8036Z43210 
F07326 AF052157 R17844 BE615476 T82160 R71985 H21963 AA299158 AW368246 R48123 R50628 R70441 H27245 
H72015 R72345 R39392 AI909738 BE612778 BE613234 0521 16 D52136 D52132 052067 051922 051995 051905 N34249 
N25459 AA464436 AA297350 AA297466 R81736 H02737 AW562505 R27523 AI834241 AW130867 W72668 W76426 
AA358363 R50262 AW473860 H52335 H43953 H21964 T39505 AI887517 AW156925 AW839850 H02628 AW007705 
AI561008 F22392 R71279 AA995433 R50725 W24462 R71931 AA464437 AW591731 R25667 R52695 R50810 AI560805 
AI089266 H68386 H41353 H28590 AW001860 AI141623 AA250773 AI284778 AW511412 AW083975 AA130377 AWG26047 
R50551 R81494 AB57668 AI078272 F32666 F36981 AW304865 H43906 AA931068 R48010 A154Q217AI017339 AI291812 
AI741954 AA458490 AI088378 AA298764 H61168 AA358362 AA298725 AA298515 AA464148 AA443538 R43046 AA084314 
T40641 T47608 T48940 A1082477 AW470145 N92284 AI758958 AA298512 AA284586 A1597777 AA480277 AI932559 
A1869081 AA476615 AA503651 AI656024 AW168522 AI682051 AI689106 A1274592 AB20917 BE258916 BE615861 
BE280282 R53386 BE278255 BE278398 T47607 AM77662 H68385 

100817 19648.1 L34355 L46810 NML000023 U08895 AA424260 AI097272 AA&4162 N79764 F19290 F25278 A1479385 
AA460662 AA432059 AW016935 F25770 F32549 F36677 F33016 F35992 F36010 AW172497 AA835076 F28727 AA21 1643 
AA453282 

U79251 AA843851 R38201 R66461 R44908 AA683289 H17477 R37364 R52832 AW298336 AA351391 NM_002545 L34774 
AA296886 AW967001 T28889 R13451 T77331 AL1 19196 AL118830 H08459 AW892812 AW905838 H17585 R52878 
BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 
BE269598 BE559865 BE396881 BE560031 BE514199 BE560037 BE560454 
X07881 NMJJ06249 X07637 AA376715 AA376677 X07715 X07704 S80916 

BE387614 R51501 M199714 AW674779 F08178 BE269071 AA376313 H08264 AA380420 H18785 AL042151 BE277758 

BE267438 NMJW5850 L35013 BE540833 BE390902 BE391494 BE277459 BE385592 BE390612 BE384263 BE387779 

BE388647 BE537373 BE547158 AW409585 AW374033 AW602185 AA355725 AW577548 AW935015 AW935160 W40232 

AW938647 AW374332 AA434040 BE293488 AL138361 BE560260 AI745075 AA317980 AW949382 AI83431 1 AI653582 

A1831042 AI361878 AA618606 AA729052 A1424969 AA199715 AW769374 A1828422 AW044307 AI862816 AI203583 

AW084461 AW514655 AA831883 AA290672 AA831286 AA578510 AW089965 AW150746 AA292743 H22232 AI469275 

AW439312 AA292744 AW471443 AI473989 AA593336 AA464070 AI678937 AW069451 AA970763 AA610480 AA593328 

AA464009 AA768985 AI298928 AA436600 AA464718 AA699361 D61482 D55935 AI369591 AA470695 AI809135 AA640627 

AI568446 R51502 W45467 AI655316 AA463934 AW168609 AW518663 BE045525 Z41251 AI868091 AA908160 AI026697 

AI886259 AI612932 AA215437 AI956014 BE541087 BE255652 BE265878 BE394102 W27502 

U48936 L36592 X87160 NM.001039 AL036606 AL036420 U35630 AW298574 

W80551 M85370 

AA976427 U66052 

AI457648 U72509

U72512 T98357 R31335 F18090 

L32961 NM.000663 U80226 S75578 AA425061 AA429317 AI815143 AA910669 AI286022 AI286019 
U88896 U88898 AA91 6056 T03285 AI341594 AI359534 AI634031 U88897 

BE397750 AA232171 BE562900 BE384894 BE242228 BE206819 BE261742 AA296468 AW959763 BE276164 BE264109 
BE392626 BE256735 AA301453 N55872 H01676 AA292746 AA427485 AA498400 AA352389 

Y10518 Y10514 ZB3935 Y10508 AK000055 Y10519 AI142012 AI681 175 BE222219 AA890586 BE504347 BE328064 N63044 

N51226 AI151248 AI521996 AI924777 AW375954 A1860275 W00549 AI742673 AW612288 AI763062 AA632510 AI087347 

A1088070 AI214349 AA890297 AI494156 AI698598 AA631658 AA504593 AA860733 AI266761 AW663214 AW771231 

AA639610 AI769806 AI769746 AW014326 AI28861 1 

AA250806 AA459220

AA429212 W00881 

T88798 R92430 

AI084125 AI083773 AI479687 AI939609 A1968662 AF129507 NM.013282 AW971840 AW298508 AA744240 AA811217 
AA827671 AA81 1055 AA806567 AA488977 AA908902 AI637637 AA927056 AI870139 AW340492 AA488755 AA129794 
AA306523 AA354253 BE256277 AC053467 AW962084 

AA321355 AW964592 R23284 H73883 R23382 N47914 C01377 H04668 AW606248 R34447 AA847136 AI684489 AI523112 
AW044269 AI379138 N29366 AA761543 N79248 AA960845 AA768316 AI147926 AI718599 AI880620 R67467 AI216016 
AI738663 H04648 

NM.001395 Y08302 AI434619 AI470328 AI261807 AW024965 AI806537 AI830549 AI640337 AI219065 AW271700 
AW028488 AI133339 AI859205 R51175 U87167 BE379324 BE392008 AA340819 AA3431 10 T57275 D59164 AW299312 
AJ434422 A1936390 AW024975 R40262 

AW269126 R09430 T56590 AI367247 AI253132 BE464248 T58658 AW207785 T58607 
R51 194 A1732276 R53587 AI820697 

AK000526 BE550084 W30689 AW271859 AA411456 AI341551 AA242990 AA243027 H87046 D20360 AI184053 AA146956 
AI721023 AI718944 AA146955 F18215 AA903890 A1700355 A1075430 AA411584 AA878210 AI476760 AW945637 AA630596 
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AA431522 AA301989 AI909058 D12149 N41960 BE222214 AA609922 AA828176 AA393359 AA398693 AW024956 
BE467805 AW298623 AW264085 A1Q24454 AI024719 AI431927 T55087 AI61 1014 T54920 AA131253 AI436344 

114427 9724_2 AA017176 AI359979 AA047836 AA01 7063 AA01 6303 AA001545 

114569 110077 J AA063315 AA063316

100106 1562L-5 AF015910 

100515 342 1 AA305746 D90187 T63943 AW951 154 T29182 AI734941 D13264 A1299239 Z18812 AW299859 W24476 AA933064 

AA489759 

100531 46038 1 AW888554 AW607282 AA319986 M28590 

100545 22955.11 M55405 AW752552 

100574 17320_2 AA326895 M1 0036 NM 000365 N84665 H6941 4 N84657 AA380453 AA329743 AA357367 AA1 88770 AA376532 AA353653 

AA158953 AA083176 BE537313 AA181433 D53373 R57376 AA206698 R14807 H18899 H1 1191 H93892 R25593 T61 134 
N93285 AA083081 AA831789H13137AA497014AA079330 AA182861 H13138W47161 R62913 AA687089AA211112 
AA429237 AL035923 AA 100070 AW392898 AI566433 AA866006 AA214002 AW392865 N79454 AA197181 AJ680371 
AA176501 M737967 AI089225 F34874 AW571437 AI620620 AA573489 AA423816 AA164917 AA458455 T47072 AI569087 
AI261656 M730919 AI633441 AW195182 AI351622 AW243465 AI872649 AI359227 AA987941 AI693770 T47073 AW779948 
AW510580 AI635626 AW627601 AA864326 AA953578 AI341418 BE222853 AI241963 AI094663 AA928380 AA493373 
AW043762 AI377783 AW958987 BB619760 AA385240 BE277975 BE280095 AW631443 AA581048 BE61871 5 BE299610 
C14874 BE559858 BE378455 BE618290 BE544585 AI525575 BE548897 BE2671 10 AA804738 BE269821 AA918133 
BE277647 AA599947 BE280735 BE390239 N74150 T12504 AI2081 97 AW955527 AA1 13897 N40081 H73835 H70393 
AI434041 W22950 A1192661 BE264461 W26486 AA626424 AA196694 T69209 AA857976 AI540267 AA410599 AA864287 
AW950564 AA013320T49283 AI541438 AW804703 AA335534 AA335659 BE562269 BE618802 BE277850 BE546413 
BE280994 AA204813 BE561694 BE543524 BE253647 AW001452 W191 16 BE542508 AA205894 BE254875 BE270033 
AI525906 BE251792 AA975700 BE272138 AW607671 N87686 M10036 BE515060 BE298607 AI745178 U47924 H03193 

100627 tigr_HT2798 Z25424

100756 tigr_HT3768 M88357

100768 tigr_HT3846 L29141 M69180 M81105

100813 tigr_HT4265 L33999

100836 tigr_HT4383 U04688

100855 tigr_HT4504 U09806

102104 entrez_U12139 U12139

125091 genbank_T91518 T91518

100929 tigr_HT688 X65561

125147 _entrez_W38150 W38150

102354 entrez_U38268 U38268

102491 entrez_U51010 U51010

102636 entrez_U67092 U67092

118769 genbank_N74496 N74496

101046 entrez_K01160 K01160

101057 entrez_K03430 K03430

108334 genbank_AA070473 AA070473

108417 483241J AA070853 AA075749 AA075716

108441 genbank_AA079079 AA079079

108786 genbank_AA128999 AA128999

101655 entrez_M60299 M60299

101697 entrez_M64358 M64358

117437 genbank_N27645 N27645

101798 entrez_M65220 M85220

101909 entrez_S69265 S69265

103508 entrez_Y10141 Y10141

103575 entrez_Z26256 Z26256

119332 genbank_T54095 T54095

112161 genbank_R48295 R48295

119564 NOT_FOUND_entrez_W38206 W38206

114376 NOT_FOUND_entrez_GMCSF GMCSF

100478 tigr_HT1067 M22406

100547 tigr_HT2219 M57417 

100564 tigr_HT2324 Z11585 
TABLE 12 shows genes, including expressed sequence tags, that are down-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of background-subtracted normal prostate to prostate tumor tissue
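The R1 column can be illustrated with a short calculation. The following Python sketch is one plausible reading of "background subtracted" and "average", not the procedure actually used to generate the table: it assumes an arithmetic mean over samples, a single background value subtracted from each average, and a small floor so the denominator never reaches zero. The function name, the `background` and `floor` parameters, and the sample intensities are all hypothetical.

```python
from statistics import mean
from typing import Sequence


def down_regulation_ratio(
    normal: Sequence[float],
    tumor: Sequence[float],
    background: float = 0.0,
    floor: float = 1.0,
) -> float:
    """Ratio of average normal signal to average tumor signal.

    Assumptions (not stated in the source): arithmetic mean, one shared
    background value, and a floor applied after background subtraction.
    """
    avg_normal = max(mean(normal) - background, floor)
    avg_tumor = max(mean(tumor) - background, floor)
    return avg_normal / avg_tumor


# Hypothetical probeset intensities, chosen only to show the arithmetic.
normal_samples = [220.0, 180.0]   # mean 200, minus background 20 -> 180
tumor_samples = [30.0, 50.0]      # mean 40, minus background 20 -> 20
r1 = down_regulation_ratio(normal_samples, tumor_samples, background=20.0)
print(r1)
```

Under this reading, a value well above 1 indicates lower expression in tumor than in normal tissue, which matches the ordering of the rows that follow.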



Pkey ExAccn UnigeneID Unigene Title R1

100522 HG1763-HT1780 Prolactin-Induced Protein 17.4

130803 M81650 Hs.1968 semenogelin I 16.785

118068 N53943 Hs.13743 ESTs 13.225

114251 Z39898 Hs51948 ESTs 12.7 

112134 R46025 Hs.7413 ESTs 8.735

101436 M20642 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 8.175

104028 AA361094 HS521128 ESTs 8.15 

108944 AA149204 Hs.175783 ESTs; Highly similar to growth arrest inducible gene product [H.sapiens] 7.535 

103838 AA174173 Hs.12622 ESTs 7.212 

120469 AA251741 H&25882 DKFZP586M1824 protein 7.175 

110279 H29231 HS57384 ESTs 6.701 

127472 AA761378 Hs.192013 ESTs 6.642 

133301 N35229 Hs.7037 pallid (mouse) homolog; pallidin 6.411 

102457 U48807 Hs5359 dual specificity phosphatase 4 6.395 

114011 W90385 Hs.15082 ESTs 6.15 

101249 L33881 Hs.1904 protein kinase C; iota 6 

123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6 

119322 T49655 Hs541569 ESTs; Modly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 5.95

101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 5.925

115586 AA399218 Hs.92423 ESTs 5.7 

120590 AA281780 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7

109748 F10192 Hs548323 Tubulin; alpha; brain-specific 5.625 

134727 X80507 Hs.8939 yes-associated protein 65 kDa 5.5 

129171 AA234048 HsJ753 calumenin 5.486 

120390 AA233122 Hs.111460 ESTs; Highly similar to multifunctional calcium/calmodulin-dependent protein kinase II delta2 isoform [H.sapiens] 5.4

131699 R68657 Hs.90421 ESTs; Modly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 5579

104490 N71503 Hs.43087 ESTs; Weakly similar to dysferiin [H.sapiens] 5566 

102124 U14528 Hs59981 solute carrier family 26 (sulfate transporter); member 2 5.151 

109280 AA196635 Hs.86081 ESTs 5.134 

109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075 

108087 AA045709 Hs.40545 ESTs 5.075 

135006 M21665 Hs.929 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055 

119182 R80664 Hs.77067 ESTs 5.033 

129806 R62444 Hs.173373 KIAA0931 protein 4.675 

101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626 

125954 R93943 yt72c12.rl Soares retina N2b4HR Homo sapiens cDNA clone IMAGE275735 5' t 4.6 

113989 W87544 Hs521184 ESTs 4.559 

104432 J03460 Hs.99949 prolactin-induced protein 4.451 

112326 R56068 Hs.4268 ESTs 4.45 

119063 R16833 Hs£3106 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 4.45 

130376 R40873 Hs.155174 KIAA0432 gene product 4.301 

122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 45 

104142 AA447006 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING 4.175 

129413 N32767 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1 

103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05 

114266 Z40186 Hs56409 ESTs 4.05 

115206 AA262491 Hs.186572 ESTs 4.048 

123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041 

129130 H97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein [H.sapiens] 4.028 

120217 Z41078 Hs.66035 ESTs 4.028 

108536 AA084524 zn19d8.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA 4.023 

134460 AA400030 Hs.8360 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 3.925 

120418 AA236010 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 3.91 

132783 N74897 Hs.5683 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 3.889 

125052 T80174 Hs.222779 ESTs; Moderately similar to similar to NEDD-4 [H.sapiens] 3.85 

108600 AA099585 Hs.41175 ESTs 3.833 

103099 X61100 Hs.8248 NADH dehydrogenase (ubiquinone) Fe-S protein 1 (75kD) (NADH-coenzyme 3.818 

134948 H06773 Hs.93850 protein kinase; AMP-activated; gamma 2 non-catalytic subunit 3.792 

120511 AA258144 Hs.221576 ESTs 3.779 

111861 R37460 Hs.25231 ESTs 3.768 

113966 W86600 Hs.9842 ESTs 3.75 

131649 AA481254 Hs.30120 ESTs 3.708 

129775 R94659 Hs.12420 ESTs 3.707 

110191 H20568 Hs.27182 phospholipase A2-activating protein 3.7 

112678 R87160 Hs.33665 ESTs 3.7 

127115 AA375791 Hs.131894 ESTs 3.674 

132892 W92797 Hs.59378 DKFZP434G162 protein 3.653 

115023 AA252079 Hs.63931 dachshund (Drosophila) homolog 3.625 

114932 AA242751 Hs.16218 KIAA0903 protein 3.62 

106865 AA487228 Hs.19479 ESTs 3.614 

134480 AA024664 Hs.83916 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex; 5 (13kD; B13) 3.613 

124780 R42493 Hs.220839 ESTs 3.6 

130631 AA025399 Hs.169737 ESTs 3.592 

134154 AA211320 Hs.79404 neuron-specific protein 3.568 

104160 AA455706 Hs.99722 ESTs; Weakly similar to 78 KD GLUCOSE REGULATED PROTEIN PRECURSOR 3.559 

105524 AA258158 Hs.22153 ESTs; Weakly similar to KIAA0352 [H.sapiens] 3.542 

110168 H19673 Hs.176586 ESTs 3.525 

109480 AA233299 Hs.72158 ESTs 3.522 

109585 F02367 Hs.27252 ESTs 3.5 

115134 AA257107 Hs.194331 ESTs 3.5 

116083 AA455653 Hs.44581 ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens] 3.459 

120524 AA261852 Hs.192905 ESTs 3.45 

116932 H74330 Hs.150000 ESTs 3.425 

130746 AA256976 Hs.18800 ESTs; Weakly similar to KIAA0579 protein [H.sapiens] 3.42 

107513 X05451 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 3.417 

118641 N70298 Hs.49829 ESTs 3.407 

126584 AI028384 Hs.127331 ESTs 3.399 

105134 AA159953 Hs.22895 ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens] 3.325 

123502 AA600116 Hs.112526 ESTs 3.318 

132389 N50866 Hs.47135 ESTs 3.317 

105691 AA287097 Hs.75356 transcription factor 4 3.315 

131505 H85897 Hs.27755 ESTs 3.309 

120775 AA342104 Hs.96777 EST 3^ 

105579 AA278824 Hs.19218 ESTs 3.295 

128190 AA946876 Hs.148376 ESTs 3.292 

100819 HG4020-HT4290 Transglutaminase 3.288 

130217 D29956 Hs.152818 ubiquitin specific protease 8 3.273 

130068 AA608903 Hs.106220 KIAA0336 gene product 3.269 

134719 L07515 Hs.89232 chromobox homolog 5 (Drosophila HP1 alpha) 3.266 

110277 H29209 Hs.151231 ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 3.26 

127354 AM18880 Hs.185797 ESTs 3.212 

129173 R60523 Hs.109087 ESTs 3.197 

127464 AA970504 Hs.146103 ESTs 3.179 

124923 R94500 Hs.108046 ESTs 3.175 

122465 AA448164 Hs.99153 ESTs; Highly similar to CGI-73 protein [H.sapiens] 3.151 

122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapiens] 3.151 

103329 X85134 Hs.72984 retinoblastoma-binding protein 5 3.15 

129937 M95767 Hs.135578 chitobiase; di-N-acetyl- 3.15 

134197 AA057341 Hs.87889 helicase-moi 3.15 

107764 AA018219 Hs226923 ESTs 3.125 

121775 AA421773 Hs.161008 ESTs 3.125 

114768 AA149007 Hs.182339 Ets homologous factor 3.12 

132381 N48818 Hs.46884 ESTs 3.11 

123105 AA485973 Hs.143947 ESTs 3.104 

121176 AA400080 Hs.97774 ESTs 3.1 

125053 T80620 Hs.186473 ESTs 3.075 

105909 AA401739 Hs5111 ESTs 3.066 

119767 W72562 Hs.58119 ESTs 3.057 

115776 AM24038 Hs.58197 ESTs 3.056 

111713 R22988 Hs£20950 ESTs 3.05 

115301 AA280047 Hs.43948 ESTs 3.05 

118448 N66412 Hs.49189 ESTs 3 

106586 AA456598 Hs.256269 ESTs 2.995 

110415 H48239 Hs.29739 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-3A [H.sapiens] 2.979 

105173 AA182030 Hs.8364 ESTs 2.978 

101102 L07594 Hs.79059 transforming growth factor; beta receptor III (betaglycan; 300kD) 2.976 

110543 H58383 Hs.258544 ESTs 2.976 

125593 R24464 Hs.202949 KIAA1102 protein 2.964 

100824 HG4058-HT4328 Oncogene Aml1-Evi-1, Fusion Activated 2.957 

106822 AA481068 Hs.31835 ESTs 2.95 

131963 D11930 Hs.3592 ESTs 2.95 

111221 K68869 Hs.15119 ESTs 2.936 

113620 T93795 Hs.17252 EST 2.917 

105220 AA210695 Hs.17212 ESTs 2.917 

123234 AA490227 Hs.105252 ESTs 2.904 

125250 W87465 Hs.222926 ESTs; Weakly similar to D2092.2 [C.elegans] 2.9 

116196 AA465160 Hs.63386 ESTs 2.9 

122100 AA432243 Hs.41086 ESTs; Weakly similar to OXYSTEROL-BINDING PROTEIN [H.sapiens] 2.896 

111712 R22905 Hs.113716 ESTs 2.895 

126589 W78107 Hs.187698 ESTs; Weakly similar to Yer140wp [S.cerevisiae] 2.895 

111132 N64378 Hs.13149 ESTs; Highly similar to unknown function [H.sapiens] 2.894 

115307 AA280300 Hs.191346 ESTs 2.886 

108989 AA152263 Hs.18827 KIAA0849 protein 2.883 

129486 H03686 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.879 

119805 W73788 Hs.43213 ESTs 2.875 

125721 R59881 Hs.7503 ESTs 2.871 

103704 AA028171 Hs.153688 ESTs 2.868 

128420 AI088155 Hs.14146 ESTs; Weakly similar to unknown [H.sapiens] 2.866 

120571 AA280738 Hs.128679 ESTs 2.863 

123059 AA482019 Hs.238202 EST 2.86 

129462 D84239 Hs.111732 IgG Fc binding protein 2.856 

125166 W45491 Hs.172609 nucleobindin 1 2.854 

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 2.852 

109431 AA227972 Hs.43635 ESTs 2.85 

105077 AA142919 Hs£558 ESTs 2.847 

131388 R34531 Hs.92200 KIAA0480 gene product 2.846 

121080 AA398720 Hs.177953 ESTs 2.838 

112575 R73816 Hs.17385 ESTs 2.836 

130244 R26206 Hs.153293 KIAA0701 protein 2.825 

134698 AA427783 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 2.816 

116355 AA504356 Hs.88650 ESTs 2.813 

115316 AA280627 Hs£7846 ESTs 2.806 

129677 U48736 Hs.198891 serine/threonine-protein kinase PRP4 homolog 2.8 

130971 H20332 Hs£8707 signal sequence receptor; gamma (translocon-associated protein gamma) 2.799 

115054 AA252863 Hs.87729 ESTs 2.795 

130285 AA063546 Hs.202968 ESTs 2.792 

124308 H93575 Hs^27146 Homo sapiens mRNA; cDNA DKFZp564J142 (from clone DKFZp564J142) 2.783 

125502 AA732329 Hs.191959 ESTs 2.778 

114800 AA159825 Hs.131887 ESTs; Weakly similar to ORF YNL227c [S.cerevisiae] 2.768 

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapiens] 2.766 

130159 H51098 Hs.151310 PDZ domain protein (Drosophila inaD-like) 2.75 

107127 AA620504 Hs£2119 ESTs 2.742 

113547 T90746 Hs.15233 ESTs 2.734 

104639 AA004622 Hs.18214 ESTs 2.727 

127609 AA622559 Hs.150318 ESTs 2.726 

106922 AA490964 Hs.10056 ESTs 2.725 

124825 R52088 yg85c3.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 2.725 

124333 H98683 Hs.154054 ESTs 2.708 

117634 N36421 Hs.107854 ESTs; Weakly similar to SODIUM- AND CHLORIDE-DEPENDENT GLYCINE TRANSP 2.706 

101609 M54927 Hs.1787 proteolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2; uncomplicated) 2.704 

117142 H96908 Hs.42251 ESTs 2.7 

112602 R79147 Hs*03365 ESTs 2.695 

106828 AA481505 Hs.13797 ESTs 2.68 

124377 N25996 Hs.179833 ESTs 2.675 

101026 J04970 carboxypeptidase M 2.675 

124560 N66393 Hs.102754 ESTs 2.675 

124066 H02494 Hs.101615 ESTs 2.671 

130281 R12777 Hs.15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE [H.sapiens] 2.66 

110949 N49602 Hs.13308 ESTs 2.65 

111031 N54B39 Hs.221085 ESTs; Highly similar to mediator [H.sapiens] 2.633 

121770 AA421714 Hs.11469 KIAA0896 protein 2.63 

134132 U32519 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.626 

112424 R62452 Hs.191265 ESTs 2.625 

122544 AA451679 Hs.194410 ESTs 2.625 

134425 X90568 Hs.172004 titin 2.624 

111114 N63391 Hs.9238 ESTs 2.619 

116119 AA459242 Hs.44445 ESTs; Weakly similar to Kelch motif containing protein [H.sapiens] 2.615 

112079 R44164 Hs^3014 ESTs 2.6 

123033 AA481271 Hs.193945 ESTs 2.591 

124196 H52617 Hs.144167 ESTs 2.586 

125873 H14437 yf25a04.r1 Soares breast 3NbHBst Homo sapiens cDNA clone 2.58 

117684 N40184 Hs.45050 ESTs 2.575 

134938 D30037 Hs.168326 phosphotidylinositol transfer protein; beta 2.575 

131822 AA215647 Hs.200332 ESTs 2.568 

135185 U71203 Hs.96038 Ric (Drosophila)-like; expressed in many tissues 2.564 

117690 N40467 Hs.93834 ESTs 2.557 

118807 N78582 Hs.50732 protein kinase; AMP-activated; beta 2 non-catalytic subunit 2.552 

121369 AA405657 Hs.128791 Human DNA sequence from clone 967N21 on chromosome 20p12.3-13. Contains 2.55 

114860 AA235112 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H.sapiens] 2.549 

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAIR PROTEIN COMPLEMENTING 2.548 

110190 H20560 Hs.244624 ESTs 2.548 

132573 AA045333 Hs51743 ESTs; Weakly similar to !! ALU SUBFAMILY SB2 WARNING ENTRY !! [H.sapiens] 2.542 

109706 F09729 Hs.12780 ESTs 2.537 

135109 AA410391 Hs.94592 Klotho 2.525 

132810 R37027 Hs5737 KIAA0475 gene product 2.525 

124879 R73588 Hs.101533 ESTs 2.525 

103840 AA174190 Hs50932 ESTs 2.525 

119066 R22196 Hs.34492 ESTs 2.519 

114833 AA234362 Hs.87310 ESTs; Moderately similar to CGI-66 protein [H.sapiens] 2.507 

112998 T23555 Hs.103288 ESTs 2.5 

123312 AA496258 Hs.99601 ESTs 2.499 

121873 AA426270 Hs.145696 splicing factor (CC1.3) 2.491 

123321 AA496884 Hs.23972 ESTs 2.491 

107760 AA018042 Hs.95078 EST 2.483 

102580 U60808 Hs.152981 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 2.481 

103053 X56741 Hs5947 mel transforming oncogene (derived from cell line NK14)- RAB8 homolog 2.475 

124756 R38100 Hs.106294 ESTs 2.475 

112936 T15665 Hs.6185 ESTs; Weakly similar to BcDNA.GH12174 [D.melanogaster] 2.475 

125178 W58202 Hs.125731 ESTs 2.475 

112423 R62447 Hs£2123 ESTs 2.471 

123515 AA600323 Hs.112535 EST 2.462 

102842 U95020 Hs.21903 calcium channel; voltage-dependent; beta 4 subunit 2.457 

102400 U42390 Hs.171957 triple functional domain (PTPRF interacting) 2.455 

113187 T56056 Hs.9992 ESTs 2.452 

131687 L11066 Hs.3069 heat shock 70kD protein 9B (mortalin-2) 2.448 

115314 AA280583 Hs.256501 ESTs 2.437 

128211 AI206427 Hs.166707 ESTs; Highly similar to Ran-binding protein 2 [H.sapiens] 2.43 

134281 L11005 Hs.81047 aldehyde oxidase 1 2.425 

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H.sapiens] 2.425 

111348 N90041 Hs.9585 ESTs 2.418 

129430 AA258842 Hs.197677 Homo sapiens clone 23777 putative transmembrane GTPase mRNA; partial cds 2.418 

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2.417 

111164 N66857 Hs.14808 ESTs; Weakly similar to !! ALU CLASS C WARNING ENTRY !! [H.sapiens] 2.416 

132143 AA257056 Hs.7972 KIAA0871 protein 2.412 

130330 M55047 Hs.154679 synaptotagmin 1 2.408 

114219 Z39451 Hs.27389 ESTs 2.406 

117101 H94043 Hs.24341 DKFZP586I1419 protein 2.403 

125433 AA034325 Hs.54320 ESTs 2.4 

111099 N62506 Hs.21958 ESTs 2.4 

120323 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2.397 

118624 N69998 Hs21801 ESTs 2.394 

123570 AA608955 Hs.109653 ESTs 2.389 

123562 AA608893 Hs.190065 ESTs 2.388 
131546 AA262821 Hs.28578 muscleblind (Drosophila)-like 2.385 

103143 X66141 Hs.75535 myosin; light polypeptide 2; regulatory; cardiac; slow 2.384 

123646 AA609310 Hs.188691 ESTs 2.383 

130123 AA001835 Hs.150390 zinc finger protein 262 2.379 

131682 AA428368 Hs.30654 ESTs 2.378 

115909 AA436666 Hs.59761 ESTs 2.375 

125168 W45574 Hs252497 ESTs 2.372 

123973 C14805 Hs.182151 ESTs 2.361 

135197 U76456 Homo sapiens tissue inhibitor of metalloproteinase 4 mRNA, complete cds 2.357 

118689 N71545 Hs.184544 ESTs 2.357 

107734 AA016225 Hs.93386 ESTs 2.354 

124590 N69220 Hs.41381 ESTs; Weakly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 2.35 

111163 N66850 Hs.17606 ESTs 2.348 

112349 R58877 Hs.22665 ESTs; Moderately similar to dJ83L6.1 [H.sapiens] 2.345 

129076 AA262179 Hs.169343 ESTs 2.345 

134238 R81509 Hs.184571 splicing factor; arginine/serine-rich 11 2.341 

116766 H13260 Hs.95097 ESTs 2.336 

106331 AA436853 Hs.34795 ESTs 2.333 

129003 AA443752 Hs.10784 ESTs 2.332 

132368 AA599814 Hs.46637 ESTs; Weakly similar to cDNA EST yk289g5.5 comes from this gene [C.elegans] 2.332 

124697 R06273 Hs.186467 ESTs; Modly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.322 

120273 AA176688 Hs.221139 ESTs 2.313 

127110 AA304993 Hs.100861 ESTs; Weakly similar to p60 katanin [H.sapiens] 2.307 

105450 AA252621 Hs.93842 ESTs 2.301 

119819 W74371 Hs.58383 ESTs 2.297 

102302 U33052 Hs.69171 protein kinase C-like 2 2.288 

130596 N74353 Hs.16475 ESTs 2.282 

114161 Z38904 Hs.22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 2.278 

130542 U64675 Human sperm membrane protein BS-63 mRNA, complete cds 2.277 

104491 N71513 Hs.39328 ESTs 2.275 

116988 H82527 ys69e12.s1 Soares retina N2b4HR Homo sapiens cDNA clone 2.275 

126823 AA370120 Hs.7870 ESTs; Weakly similar to Ylr350wp [S.cerevisiae] 2.273 

108800 AA129731 Hs.90424 ESTs 2.273 

101310 L41607 Hs.934 glucosamine (N-acetyl) transferase 2; I-branching enzyme 2.269 

126842 W19498 Hs.21085 ESTs 2.255 

127251 AA936428 Hs.128638 ESTs 2.251 

124647 N91947 Hs.125033 ESTs 2.249 

127112 AI143906 Hs.125103 ESTs 2.247 

101973 S82597 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polypeptide 2.246 

120999 AA398302 Hs.127437 ESTs 2.245 

130225 AA599583 Hs.15299 HMBA-inducible 2.243 

119980 W88678 Hs.249247 heterogeneous nuclear protein similar to rat helix destabilizing protein 2.243 

124222 H61053 Hs.222844 ESTs 2.24 

129199 H90914 Hs.128629 ESTs 2.236 

106802 AA479101 Hs.16570 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 2.231 

126160 N90960 Hs.247277 ESTs; Weakly similar to transformation-related protein [H.sapiens] 2.229 

104627 AA001976 Hs.19603 ESTs 2.228 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 2.226 

113096 T40927 Hs.8345 ESTs 2.225 

135336 AA452822 Hs.99027 ESTs 2.225 

135344 R62976 Hs.168491 ESTs; Moderately similar to TRF1-interacting ankyrin-related 2.225 

126156 AA508354 Hs.118448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2.222 

128885 AA397841 Hs.180141 cofilin 2 (muscle) 2.218 

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2.217 

114481 AA033562 Hs.151572 ESTs 2.212 

109292 AA199828 Hs.188662 ESTs 2.212 

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2.209 

132932 T15482 Hs.6093 ESTs 2.204 

127392 AA262728 Hs.14896 Homo sapiens clone 24590 mRNA sequence 2.204 

104641 AA004652 Hs.18564 ESTs 2.2 

122529 AA449828 Hs.99229 ESTs 2.195 

124307 H93562 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193 

133601 S95936 Hs.75155 transferrin 2.193 

119904 W85709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192 

100348 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185 

126871 AA351779 Hs.200334 ESTs 2.18 

127793 AI298835 Hs.30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178 

105149 AA169253 Hs.8958 ESTs 2.177 

121367 AA405648 zw39g8.s1 Soares_total_fetus_Nb2HF8_9w H sapiens cDNA clone IMAGE:772478 2.177 

111836 R36228 Hs.25119 ESTs 2.175 

133394 R16759 Hs.237225 ribosomal protein S5 pseudogene 1 2.175 

123207 AA489697 Hs.145053 ESTs 2.175 

129801 F11087 Hs.239666 ESTs 2.175 

103393 X94612 Hs.41749 protein kinase; cGMP-dependent; type II 2.161 

132415 AA043223 Hs.4815 nudix (nucleoside diphosphate linked moiety X)-type motif 3 2.157 

106369 AA443828 Hs.25324 ESTs 2.157 

122963 AA478446 Hs.69559 KIAA1096 protein 2.156 

133473 M19309 Hs.73980 troponin T1; skeletal; slow 2.155 

134257 C06270 Hs.8078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155 

135156 AA056012 Hs.9552 binder of Arl Two 2.151 

104055 AA393755 Hs.117211 ESTs; Highly similar to CGI-62 protein [H.sapiens] 2.15 

102313 U33921 HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15 

109768 F10638 Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15 

103507 Y10032 Hs.159640 serum/glucocorticoid regulated kinase 2.15 

116000 AA448710 Hs.41327 ESTs 2.15 

105858 AA399164 Hs.227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137 

103153 X66534 Hs.75295 guanylate cyclase 1; soluble; alpha 3 2.137 

126202 AA652238 Hs.199726 ESTs 2.135 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054OO4 from 7q31 2.134 

104164 AA458770 Hs^7023 KIAA0917 protein 2.132 

108692 AA121270 Hs.82960 ESTs 2.128 

122878 AA465341 Hs.99640 ESTs 2.126 

134771 L13939 Hs.89576 adaptor-related protein complex 1; beta 1 subunit 2.125 

104298 D31120 Hs.40368 adaptor-related protein complex 1; sigma 2 subunit 2.125 

104840 AA039595 Hs.42458 Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817) 2.125 

122180 AA435798 Hs.98835 ESTs; Moderately similar to putative ring zinc finger protein 2.125 

131012 H01992 Hs.202949 KIAA1102 protein 2.125 

134092 H17490 Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123 

118617 N69666 Hs.183413 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123 

107155 AA621202 Hs.7946 DKFZP586D1519 protein 2.12 

130925 N71935 Hs.169378 multiple POZ domain protein 2.12 

135167 U63717 Hs.95821 osteoclast stimulating factor 1 2.118 

105952 AA405263 Hs.181400 ESTs 2.109 

110308 H38148 Hs.32775 ESTs 2.108 

116368 AA521186 Hs.94217 ESTs 2.107 

132939 U76189 Hs.61152 exostoses (multiple)-like 2 2.102 

117881 N50073 Hs.84926 ESTs; Highly similar to B-IND1 protein [M.musculus] 2.1 

121723 AA419622 Hs.104800 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 2.096 

103500 Y09443 Hs.22580 alkylglycerone phosphate synthase 2.094 

121429 AA406293 Hs.193498 ESTs 2.093 

134632 AA398710 Hs.174139 chloride channel 3 2.091 

129785 F10980 Hs.184780 ESTs 2.09 

111065 N58193 Hs.18740 ESTs; Weakly similar to 1-evidence 2.089 

114710 AA129931 Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083 

132711 N73702 Hs^38927 ESTs 2.083 

133377 R05490 Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079 

124773 R40923 Hs.106604 ESTs 2.078 

117759 N47587 Hs.97345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2.076 

127386 AI457411 Hs.106728 ESTs 2.076 

101167 L15309 Hs.193677 zinc finger protein 141 (clone pHZ44) 2.075 

109597 F02582 Hs.14474 ESTs 2.074 

124390 N29325 Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07 

116225 AA478609 Hs.47278 Human Chromosome 16 BAC clone CIT987SK-A-735G6 2.07 

131243 R16667 Hs.24752 spectrin SH3 domain binding protein 1 2.069 

130557 T90830 Hs.15981 ESTs; Weakly similar to line-1 protein ORF2 [H.sapiens] 2.067 

134103 D14826 Hs.155924 cAMP responsive element modulator 2.064 

108833 AA131866 Hs.61661 ESTs; Weakly similar to DY3.6 [C.elegans] 2.063 

112286 R53765 Hs.158135 KIAA0981 protein 2.063 

125624 AA165411 zq49a01/1 Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061 

124612 N72200 Hs.13913 ESTs 2.058 

116335 AA495830 Hs.87013 ESTs 2.057 

112248 R51361 Hs.23423 ESTs 2.056 

115789 AA424754 Hs.43149 ESTs 2.056 

107029 AA599219 Hs.187492 ESTs; Weakly similar to ALR [H.sapiens] 2.056 

110294 H30270 Hs.165062 ESTs 2.054 

120532 AA262354 Hs.186648 ESTs 2.054 

118180 N59249 Hs.48349 ESTs 2.052 

132018 AA293194 Hs.3737 ESTs 2.052 
132617 AA171913 Hs.5338 carbonic anhydrase XII 2.05 

131526 N36167 Hs.26274 ESTs 2.05 

113254 T64438 Hs.11449 DKFZP5640123 protein 2.05 

122765 AA459978 Hs.99508 ESTs 2.05 

107203 D20426 Hs.5656 EST 2.05 

105713 AA291321 Hs.184319 ESTs; Moderately similar to KIAA1006 protein [H.sapiens] 2.046 

129365 D82675 Hs.110950 Homo sapiens clone 25007 mRNA sequence 2.042 

119116 R43845 Hs.64595 DKFZP566E2346 protein 2.04 

116405 AA600253 Hs.55601 ESTs; Highly similar to host cell factor 2 [H.sapiens] 2.04 

125924 AA526849 Hs.82109 syndecan 1 2.039 

105599 AA279442 Hs.143460 protein kinase C; nu 2.037 

119741 W70205 Hs.43670 kinesin family member 3A 2.037 

101449 M21494 Hs.118843 creatine kinase; muscle 2.036 

107109 AA609943 Hs.32793 ESTs 2.034 

117040 H89112 yw25e5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE25328 2.034 

132906 AA142857 Hs.234896 ESTs; Highly similar to geminin [H.sapiens] 2.031 

105479 AA255546 Hs.23467 ESTs 2.027 

102031 U04898 Hs.2156 RAR-related orphan receptor A 2.027 

119846 W80363 Hs.58446 ESTs 2.024 

124809 R46482 Hs.106875 ESTs 2.024 

130286 AA041548 Hs.154023 KIAA0573 protein 2.023 

124457 N50114 Hs.128704 ESTs 2.017 

125144 W37999 Hs£4336 ESTs 2.017 

120581 AA281257 Hs.125868 ESTs 2.014 

104931 AA062731 Hs.108319 thyroid hormone receptor-associated protein; 150kDa subunit 2.012 

120548 AA278846 Hs.187634 ESTs 2.011 

113933 W81362 Hs.30567 ESTs 2.011 

123072 AA4S5041 Hs.104308 ESTs 2.009 

123648 AA609323 Hs.112689 ESTs 2.008 

116875 H67749 Hs.161022 EST 2.003 

103179 X69398 Hs.82685 CD47 antigen (Rh-related antigen; Integrin-associated signal transducer) 1.995 

103478 Y07755 Hs.38991 S100 calcium-binding protein A2 1.995 

111007 N53378 Hs.22543 ESTs 1.995 

120470 AA251797 zs11f3.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 1.989 

112280 R53457 HsJ26040 ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens] 1.989 

114127 Z38652 Hs.106961 ESTs; Weakly similar to TYL [H.sapiens] 1.988 

129863 AA151005 Hs.129872 sperm surface protein 1.988 

106320 AA436608 ESTs 1.988 

108933 AA147224 Hs.71814 ESTs 1.986 

105906 AA401633 Hs.22380 ESTs 1.982 

109029 AA157911 Hs.72200 ESTs 1.982 

118470 N66769 Hs.82781 ESTs 1.975 

115358 AA281886 Hs.88923 ESTs 1.975 

115257 AA279060 Hs.193516 B-cell CLL/lymphoma 10 1.974 

126879 AA719776 zh38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE414390 1.974 

109547 F01479 Hs.26966 ESTs 1.973 

127111 AA805726 Hs.220509 ESTs 1.969 

101266 L36645 Hs.73964 EphA4 1.966 

129319 AA037467 Hs^0340 ESTs 1.965 

106211 AA428240 Hs.126083 ESTs 1.962 

112753 R93696 Hs.169882 ESTs 1.961 

120489 AA255538 Hs.190504 ESTs 1.959 

129699 AA458576 Hs.12017 KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5 1.956 

105425 AA251129 Hs.24416 ESTs 1.953 

134740 L37362 Hs^9455 opioid receptor; kappa 1 1.95 

109324 AA210700 Hs.86405 Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056) 1.95 

124303 H93043 Hs.107070 ESTs 1.95 

102337 U36922 Human fork head domain protein (FKHR) mRNA, 3' end 1.948 

109441 AA228100 Hs.86998 nuclear factor of activated T-cells 5 1.946 

127364 AA179573 Hs.90061 progesterone binding protein 1.942 

105255 AA227498 Hs.3623 ESTs 1.942 

130672 L19783 Hs.177 phosphatidylinositol glycan; class H 1.942 

104301 D45332 Hs.6783 ESTs 1.94 

132442 R62589 Hs.167419 ESTs 1.939 

105519 AA258063 Hs.23438 ESTs 1.937 

132902 AA490969 Hs.168147 ESTs 1.936 

118873 N89881 Hs.44577 ESTs 1.936 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein [H.sapiens] 1.934 

115075 AA255486 Hs38045 ESTs 1.933 
110695 H93463 Hs.124777 ESTs 1.931 

105360 AA236209 Hs.187628 ESTs 1.931 

124998 T56013 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 1.929 

121816 AA424814 Hs.187509 ESTs 1.927 

111717 R23241 Hs.110776 STAT induced STAT inhibitor-2 1.925 

128874 H06245 Hs.106801 ESTs 1.925 

109391 AA219699 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1.913 

126129 H82165 Hs.40334 ESTs 1.911 

115553 AA369027 Hs.71414 ESTs 1.905 

113811 W44928 Hs.4878 ESTs 1.905 

108345 AA070906 zm66d1.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1.904 

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1.903 

116602 D80063 Hs.241673 EST 1.901 

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 1.9 

125330 AA401804 Hs.114574 ESTs 1.896 

130095 F01831 Hs.14638 ESTs 1.894 

119782 W72982 Hs.58262 ESTs 1.894 

104115 AA428090 Hs.26102 ESTs 1.893 

131313 C17938 Hs.22370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1.891 

105583 AA278907 Hs.24549 ESTs 1.891 

122825 AA461195 Hs.99580 ESTs 1.887 

119495 W35390 Hs.55533 ESTs 1.886 

130309 AA134289 Hs.15423 Homo sapiens BAC clone RG114B19 from 7q31.1 1.886 

125628 AA418069 Hs.241493 natural killer-tumor recognition sequence 1.886 

110611 H66947 Hs.14671 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1.885 

117301 N22569 Hs.43215 ESTs 1.884 

131406 N92239 Hs£6471 Wnt inhibitory factor-1 1.881 

126428 AA013312 Hs.64988 ESTs 1.881 

120285 AA182882 Hs.111110 titin-cap (telethonin) 1.878 

112724 R91753 Hs.17757 ESTs 1.878 

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1.875 

124381 N26765 Hs.109008 ESTs 1.875 

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1.875 

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1.875 

111229 N69113 Hs.110855 ESTs 1.875 

120627 AA285079 Hs.190474 ESTs 1.873 

107048 AA600012 Hs.106G9 ESTs; Moderately similar to KIAA0400 [H.sapiens] 1.872 

104041 AA381902 Hs.197114 RNA binding protein 1.872 

115162 AA258366 Hs.227806 ras GTPase activating protein-like 1.872 

102239 U26726 Hs.1376 hydroxysteroid (11-beta) dehydrogenase 2 1.87 

100043 M10098 AFFX control: 18S ribosomal RNA 1.868 

120296 AA191353 Hs£2385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1.867 

129011 S72869 Hs.107932 DNA segment; single copy; probe pH4 (transforming sequence; thyrokM; 1.867 

134851 R44479 Hs.90232 KIAA0552 gene product 1.866 

117392 N26175 Hs.93405 ESTs 1.864 

114530 AA053027 Hs.191797 ESTs 1.863 

123541 AA608794 Hs.112592 ESTs 1.863 

124890 R78618 Hs.34145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1.862 

105299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1.861 

103560 Z20656 Hs.182787 myosin; heavy polypept 6; cardiac muscle; alpha (cardiomyopathy; hypertrophic 1) 1.861 

113073 T33637 Hs.6841 ESTs 1.86 

120407 AA235040 Hs.107283 ESTs 1.859 

103892 AA243523 Hs.17155 ESTs 1.858 

123795 AA620381 Hs.70488 ESTs 1.857 

108524 AA084323 Hs.68138 ESTs 1.857 

113953 W85812 Hs.187554 ESTs 1.856 

110721 H97678 Hs31319 ESTs 1.856 

129426 AA412087 Hs.168272 EST; Highly smlr to prot inhibitor of activated STAT prot PIASx-alpha [H.sapiens] 1.853 

112102 R44840 Hs.21303 ESTs 1.852 

118502 N67317 Hs.50150 ESTs 1.852 

107619 AA004955 Hs.60015 ESTs 1.851 

100436 D87446 Hs.75912 KIAA0257 protein 1.85 

120652 AA287312 Hs.191648 ESTs 1.85 

121643 AA417078 Hs.193767 ESTs 1.843 

117387 N26011 Hs33810 ESTs 1.843 

132084 Y12394 Hs.3886 karyopherin alpha 3 (importin alpha 4) 1.843 

124449 N48593 Hs.121820 ESTs 1.841 

120263 AA173440 Hs.193919 ESTs 1.838 

127226 AA731036 Hs3463 ribosomal protein S23 1.838 
111837 R36447 Hs.24453 ESTs 1.835 

128727 M64174 H$30651 Janus kinase 1 (a protein tyrosine kinase) 1.834 

114439 AA018937 Hs.128629 ESTs 1.833 

102332 U35637 Human nebulin mRNA, partial cds 1.83 

126579 W72979 Hs.146082 ESTs 1.83 

102341 U37122 Hs.8110 adducin 3 (gamma) 1.83 

114246 Z39848 Hs.12079 ESTs 1.828 

131757 D17532 Hs316 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 6 (RNA helicase; 54kD) 1.823 

108904 AA136521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1.823 

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1.823 

131957 AA609008 Hs.183232 ESTs 1.822 

100131 D12485 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1.822 

124163 H30539 Hs.189838 ESTs 1.821 

118204 N59859 Hs.48443 ESTs 1.821 

107727 AA016021 Hs.173091 DKFZP434K151 protein 1.82 

100357 D78156 Hs.241548 RAS p21 protein activator 2 1.82 

116295 AA489016 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 1.82 

124833 R54112 Hs.128697 ESTs 1.817 

122587 AA453255 Hs.6968 ESTs 1.817 

114359 Z41589 Hs.153483 ESTs; Moderately similar to H1 chloride channel [H.sapiens] 1.815 

111289 N72253 Hs.238246 ESTs 1.813 

110826 N30068 Hs.15347 ESTs 1.812 

104106 AA422123 Hs.42457 ESTs 1.811 

130043 AA055404 Hs.193953 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1253 

115864 AA432080 Hs31200 ESTs 1.81 

129737 AA056140 Hs.122684 ESTs 1.81 

124477 N53158 Hs.102682 ESTs 1.809 

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1.806 

106101 AA421053 Hs.34395 ESTs 1.806 

115479 AA287596 zs52h09.s1 NCI_CGAP_GCB1 H sapiens cDNA clone IMAGE:701153 1.804 

116104 AA456635 Hs.78524 ESTs 1.804 

114173 Z39050 Hs.21963 ESTs 1.804 

132632 N59764 Hs.5398 guanine-monophosphate synthetase 1.803 

1191% R49548 Hs.169681 death effector domain-containing 1.802 

131559 N91087 Hs.28728 ESTs; Weakly similar to F55A12.9 [C.elegans] 1.801 

126922 AA177138 Hs.161671 ESTs 1.8 

117375 N25427 Hs.108812 ESTs 1.8 

103571 Z25535 Hs.211608 nucleoporin 153kD 1.8 

105978 AA406367 Hs.15973 ESTs 1.8 

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798 

105777 AA348412 Hs.23096 ESTs 1.797 

110166 H19480 Hs.174309 ESTs 1.796 

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H^aplens] 1.796 

105427 AA251330 Hs£8248 ESTs 1.795 

115278 AA279757 Hs.67466 ESTs; Weakly similar to BACN32G1 1.d [Djnelanogaster] 1.794 

133104 L13698 Hs.65029 growth arrest-specific 1 1.794 

131170 N48674 Hs.23796 Human DNA sequence from done 1052M9 on chromosome Xq25. Contains the 1.792 

100136 D13540 Hs22868 protein tyrosine phosphatase; non-receptor type 11 1.791 

127263 AA331157 EST35035 Embryo, 6 week, subtracted (total cDNA) I Homo sapiens cDNA 1.79 

114157 Z38878 Hs.24979 ESTs 1.79 

125601 AI096717 Hs.247043 KIAA0525 protein - 1.788 

118472 N66818 Hs.42179 ESTs 1.787 

112456 R63925 Hs.28464 ESTs 1.787 

130236 N69682 Hs.51957 SC35-interacting protein 1 1.786 

133297 AA600057 Hs.70266 KIAA09Q5 protein 1.784 

125650 R40096 Hs.176578 ESTs 1.784 

132056 T89386 Hs.38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783 

129093 AA262710 Hs.1 08614 KIAA0627 protein 1.783 

123176 AA489020 Hs.193424 ESTs 1.782 

106340 AA441792 Hs.22&57 chord domain-containing protein 1 1.781 

100598 HG2463-HT2559 Guanine Nudeotide-Binding Protein G25k 1.779 

104038 AA374532 EST86676 HSC172 cells I Homo sapiens cDNA 5* end, mRNA sequence 1 .778 

122235 AA436475 Hs.190104 ESTs 1.777 

105104 AA151771 Hs.76941 ATPase; NaWK+ transporting; beta 3 polypeptide 1.776 

107601 AA004636 Hs30223 ESTs 1.776 

131467 W68255 Hs.27194 DKFZP434K1 71 protein 1.776 

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776
107969 AA034030 Hs.155212 methylmalonyl Coenzyme A mutase 1.775
115527 AA342079 Hs.252055 ESTs 1.775
132471 T16305 Hs.49349 beta-site APP-cleaving enzyme 1.775
105966 AA406105 Hs.5344 adaptor-related protein complex 1; gamma 1 subunit 1.774
127546 AA373091 Hs.93832 Homo sapiens clone 24483 unknown mRNA; partial cds 1.774
106217 AA428379 Hs.24870 ESTs 1.773
131214 N26777 Hs.172635 ESTs 1.773
106295 AA435664 Hs.8583 similar to APOBEC1 1.773
106328 AA436705 Hs.28020 KIAA0766 gene product 1.772
124661 N93797 Hs.3090 EphB1 1.772
122988 AA479166 Hs.105633 ESTs 1.772
115504 AA291946 Hs.42736 ESTs 1.771
105168 AA180208 Hs.16606 ESTs; Highly similar to CGI-32 protein [H.sapiens] 1.767
129153 AA188618 Hs.181461 ariadne; Drosophila; homolog of 1.766
105829 AA398290 Hs£1965 ESTs 1.764
101811 M86917 Hs.24734 oxysterol binding protein 1.764
100138 D13628 Hs.2463 angiopoietin 1 1.764
124704 R07335 ye96cU1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 1.763
122314 AA442257 Hs.192076 ESTs 1.762
109865 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.761
106206 AA428069 Hs.89519 KIAA1046 protein 1.758
107135 AA620782 Hs.23247 ESTs 1.757
105760 AA338960 Hs£8170 ESTs 1.756
106288 AA435536 Hs.24336 ESTs 1.756
103968 AA304566 Hs.3542 ESTs 1.756
129559 AA234945 Hs.11360 ESTs 1.756
117885 N50112 Hs.47023 ESTs 1.754
107032 AA599472 Hs.247309 succinate-CoA ligase; GDP-forming; beta subunit 1.754
124807 R45963 Hs.233811 ESTs; Weakly similar to ORF2 [M.musculus] 1.753
100276 D42047 Hs.82432 KIAA0089 protein 1.753
110924 N47938 yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 1.751
133002 AF006082 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.751
132530 AA455917 Hs.50785 SEC22; vesicle trafficking protein (S. cerevisiae)-like 1 1.75
110759 N21671 Hs.19025 ESTs 1.75
106138 AA424515 Hs.33264 ESTs 1.75
107348 U43701 Hs.184776 ribosomal protein L23a 1.75
115867 AA432162 Hs.165986 DKFZP586B2022 protein 1.749
135398 AA194075 Hs.99908 nuclear receptor coactivator 4 1.747
113783 W19222 Hs.7041 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.747
134898 X98330 Hs.90821 ryanodine receptor 2 (cardiac) 1.745
132215 T10132 Hs.4236 KIAA0478 gene product 1.744
104229 AB002346 Hs.61289 synaptojanin 2 1.743
116166 AA461556 Hs.202949 KIAA1102 protein 1.743
115433 AA284252 Hs.58372 ESTs 1.743
114908 AA236545 Hs.54973 ESTs 1.742
127425 AA470941 Hs.143162 ESTs 1.741
131089 Z38807 Hs.52870 ESTs 1.739
113498 T88908 Hs.189746 ESTs 1.738
116710 F10577 Hs.70312 ESTs 1.735
127210 R51476 yg76f04.fi Soares infant brain 1NIB Homo sapiens cDNA clone 1.733
120554 AA279654 Hs.194524 ESTs 1.733
129940 U18242 Hs.13572 calcium modulating ligand 1.732
117023 H88157 Hs.41105 ESTs 1.731
111700 R22212 Hs.23361 ESTs 1.731
116911 H72240 Hs.39292 ESTs; Moderately similar to KIAA0745 protein [H.sapiens] 1.731
106025 AA412063 Hs£Q65 ESTs 1.728
108626 AA101984 Hs.61697 G-protein coupled receptor 1.726
111614 R12581 Hs.191146 ESTs 1.726
134134 L76703 Hs.173328 protein phosphatase 2; regulatory subunit B (B56); epsilon isoform 1.725
106886 AA489086 Hs.36545 ESTs 1.725
117998 N52136 Hs.93828 ESTs 1.725
121204 AA400422 Hs.55896 ESTs 1.725
121342 AA404995 Hs.192480 ESTs 1.725
131129 R27296 Hs.23240 ESTs 1.725
116235 AA479181 Hs.186726 ESTs 1.725
102423 U44754 Hs.179312 small nuclear RNA activating complex; polypeptide 1; 43kD 1.724
110273 H29050 Hs.24096 ESTs 1.722
108758 AA127395 Hs.222414 ESTs 1.722
110672 H88477 Hs.191178 ESTs 1.721
120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D28915 Hs.82316 Interferon-induced; hepatitis C-associated microtubular aggregate prot (44kD) 1.719
129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719
134663 W73367 Hs.8750 ESTs 1.717
104902 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.717
120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331191_1 [H.sapiens] 1.717
134891 F03517 Hs.90787 ESTs 1.716
106219 AA428567 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715
116372 AA521311 Hs.13854 ESTs 1.713
107570 AA001870 Hs^37323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713
106198 AA427816 Hs.11803 ESTs 1.712
125136 W31479 Hs.129051 ESTs 1.712
104973 AA085676 Hs.6763 KIAA0942 protein 1.712
128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptide 5 1.711
123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711
127871 AA766511 Hs.128848 ESTs 1.71
116089 AA455933 Hs.41324 ESTs 1.709
123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGL050w [S.cerevisiae] 1.708
123619 AA609200 Hs.162686 ESTs 1.708
104781 AA026617 Hs.21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707
115114 AA256468 Hs.88146 ESTs 1.705
117852 N49408 Hs.136102 KIAA0853 protein 1.705
127644 T57570 Hs.77039 ribosomal protein S3A 1.704
111359 N91273 Hs.27179 ESTs 1.702
131721 L36644 Hs.31092 EphA5 1.7
132438 F08925 Hs.48610 ESTs 1.7
132476 N67192 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat region mRNA 1.7
130990 F02488 Hs.21917 KIAA0768 protein 1.7
128499 AA487503 Hs.100636 ESTs 1.698
120780 AA342337 Hs.241569 ESTs; Modtly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.697
132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696
135037 U77948 Hs.184122 general transcription factor II; i 1.696
110024 H11297 Hs.31050 ESTs 1.695
134415 AA329274 Hs.82911 protein tyrosine phosphatase type IVA; member 2 1.694
102223 U24685 Hs.148226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 non-productive rearrangement 1.694
126712 AA205862 Hs.7942 ESTs 1.694
101507 M27492 Hs.82112 interleukin 1 receptor; type I 1.692
106291 AA435551 Hs.30824 ESTs 1.691
116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRBP76 [H.sapiens] 1.69
135339 D59269 Hs.127842 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69
118250 N62602 yz75b6.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element;, mRNA sequence 1.689
106470 AA450116 Hs.186180 ESTs 1.688
108203 AA057678 Hs.63408 ESTs 1.687
119748 W70313 Hs.126906 ESTs 1.686
116576 D51228 Hs.79404 neuron-specific protein 1.683
123035 AA481392 Hs.105166 ESTs 1.683
126668 AA011616 Hs.184086 ESTs 1.681
101512 M28209 Hs.250716 RAB1; member RAS oncogene family 1.678
102704 U76638 Hs.54089 BRCA1 associated RING domain 1 1.677
126218 AA256386 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676
111180 N67277 Hs.9403 ESTs 1.676
105937 AA404342 Hs.173531 ESTs 1.675
114118 Z38520 Hs.175930 ESTs 1.675
109203 AA190634 Hs.108787 endoplasmic reticulum membrane protein 1.675
125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675
102906 X06956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675
125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674
134294 U63289 Hs.81248 CUG triplet repeat; RNA-binding protein 1 1.674
109742 F10108 Hs.183333 ESTs 1.673
134674 D63876 Hs.87726 KIAA0154 protein 1.673
104079 AA402937 Hs.103238 ESTs 1.671
107554 AA001386 Hs.59844 ESTs 1.671
132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668
124300 H92575 Hs.105959 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.668
126809 AA743475 Hs.171693 ESTs 1.667
106095 AA419547 Hs.11713 ESTs 1.664
101754 M77142 Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein 1.663
105188 AA192306 Hs.23926 ESTs 1.663
113582 T91371 Hs.16824 EST 1.661
119559 W38197 Accession not listed in Genbank 1.661
119961 W87535 Hs.59015 ring finger protein 9 1.657
123255 AA490890 Hs.105273 ESTs 1.657
111078 N59230 Hs.186574 ESTs 1.655
113082 T40528 Hs.8246 ESTs 1.654
119589 W44692 Hs.124177 ESTs 1.652
104308 D53639 Hs.77904 ribosomal protein S26 1.65
103073 X59417 Hs.74077 proteasome (prosome; macropain) subunit; alpha type; 6 1.65
124424 N35314 Hs.107265 ESTs 1.65
128890 AA096157 Hs.182364 ESTs; Weakly similar to 25 kDa trypsin inhibitor [H.sapiens] 1.65
119400 T92767 ye27d06.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:118955 3', mRNA sequence. 1.65
131631 AA486868 Hs.29802 slit (Drosophila) homolog 2 1.65
118229 N62339 Hs.180532 heat shock 90kD protein 1; alpha 1.649
118533 N67954 Hs.49413 ESTs 1.648
130666 AA476307 Hs.194035 KIAA0737 gene product 1.647
103093 X60708 Hs.44926 dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2) 1.647
128667 U69140 Hs.103419 fasciculation and elongation protein zeta 2 (zygin II) 1.646
112933 T15530 Hs.221439 ESTs 1.646
114546 AA056263 Hs.132747 ESTs 1.645
1267(5 AA579377 Hs.180532 heat shock 90kD protein 1; alpha 1.644
114399 AA007595 Hs.220937 ESTs 1.642
118836 N79820 Hs.50854 ESTs 1.64
100401 D85423 Homo sapiens mRNA for Cdc5, partial cds 1.64
105681 AA284865 Hs.171228 KIAA1040 protein 1.639
132526 AA460128 Hs.5074 similar to S. pombe dim1+ 1.639
133809 AA034002 Hs.76359 catalase 1.639
115968 AA447083 Hs.134522 ESTs 1.637
116370 AA521256 Hs.236204 ESTs; Moderately similar to NUCLEAR PORE COMPLEX PROTEIN NUP107 [R.norvegicus] 1.631
109644 F04477 Hs.204802 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE; LIVER [H.sapiens] 1.627
103427 X97303 H.sapiens mRNA for Ptg-12 protein 1.627
132186 T33888 Hs.221040 KIAA1038 protein 1.626
131428 U17838 Hs.26719 PR domain containing 2; with ZNF domain 1.626
126638 AA649257 Hs.188602 ESTs 1.625
114503 AA039568 Hs.188083 ESTs 1.625
121242 AA400857 Hs.97509 EST 1.625
122414 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.625
110632 H72344 Hs.171635 ESTs 1.624
111369 N95837 Hs.169111 ESTs; Weakly similar to LB2A [D.melanogaster] 1.624
112449 R63802 Hs.124186 ring finger protein 2 1.623
113070 T33464 Hs.6298 ESTs 1.622
107229 D59284 Hs.34644 ESTs 1.618
132710 W93726 Hs.55279 protease inhibitor 5 (maspin) 1.617
124664 N94814 Hs.33540 ESTs; Weakly similar to KIAA0765 protein [H.sapiens] 1.617
130166 AA350690 Hs.151411 KIAA0916 protein 1.616
125040 T78451 Hs.199961 ESTs 1.615
132972 H39627 Hs.164967 ESTs; Weakly similar to !! ALU SUBFAMILY SB WARNING ENTRY !! [H.sapiens] 1.615
115873 AA433916 Hs.90093 heat shock 70kD protein 4 1.611
120408 AA235045 Hs.190151 ESTs 1.61
120934 AA383773 Hs.191500 ESTs 1.61
115259 AA279071 Hs.13453 splicing factor 3b; subunit 1; 155kD 1.609
134330 020113 Hs.8185 ESTs; Highly similar to CGI-44 protein [H.sapiens] 1.607
115117 AA256492 Hs.49007 poly(A) polymerase 1.606
125162 W44682 Hs.109896 ESTs 1.605
103946 AA285246 Hs.111650 ESTs; Weakly similar to Prt1 homolog [H.sapiens] 1.604
133389 AA166917 Hs.72639 ESTs 1.603
115528 AA342301 Hs.53929 ESTs; Weakly similar to !! ALU CLASS 6 WARNING ENTRY !! [H.sapiens] 1.602
129704 W81301 Hs.12064 ubiquitin specific protease 22 1.602
109313 AA206800 Hs.86276 ESTs; Moderately similar to zinc finger protein dp [H.sapiens] 1.601
130457 U58091 Hs.155976 cullin 4B 1.6
123076 AA485211 Hs.190046 ESTs 1.6
115113 AA256460 Hs.44810 ESTs 1.6
117731 N46433 Hs.46609 ESTs 1.6
123344 AA504338 Hs.171857 ESTs 1.599

131798 X86098 Hs.3238 adenovirus 5 E1A binding protein 1.597
125370 AA256743 Hs.151791 KIAA0092 gene product 1.598
114918 AA236813 Hs.72324 ESTs; Highly similar to unknown [H.sapiens] 1.596
114807 AA160805 Hs.199832 ESTs 1.596
105103 AA151593 Hs.10130 ESTs 1.594
125004 T60120 yb68!02.s1 Stratagene ovary (#937217) Homo sapiens cDNA clone IMAGE:76347 3', mRNA sequence. 1.592
105658 AA282914 Hs.10176 ESTs 1.589
110455 H52172 yt85e8^1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE231 1 1 3' similar to contains Alu repetitive element, mRNA sequence 1.589
119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.587
126983 AA211537 zn55d01.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562081 5', mRNA sequence. 1.586
134675 AA250745 Hs.57773 protein kinase; cAMP-dependent; catalytic; beta 1.584
105431 AA252033 Hs.15036 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.584
120187 Z40251 Hs.56974 ESTs 1.584
115830 AA428137 Hs.86434 ESTs 1.581
135069 AA456311 Hs.93961 ESTs; Weakly similar to !! ALU CLASS A WARNING ENTRY !! [H.sapiens] 1.581
122997 AA479295 Hs.106290 Kelch motif containing protein 1.581
119707 W67569 Hs.44143 ESTs; Weakly similar to SNF2alpha protein [H.sapiens] 1.58
131934 D80948 Hs.34922 ESTs 1.58
106141 AA424558 Hs.9302 phosducin-like 1.58
115271 AA278422 Hs.5724 ESTs 1.579
131468 R27598 Hs.27197 KIAA0797 protein 1.577
131165 R98173 Hs.23763 Max-interacting protein 1.575
117273 N21680 Hs.43047 ESTs 1.575
101569 M33772 Hs.182421 troponin C2; fast 1.575
116127 AA459703 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 1.575
120022 W90625 Hs.58432 ESTs 1.575
117512 N32157 Hs.82207 ESTs 1.574
106511 AA452865 Hs.206713 UDP-Gal:betaGlcNAc beta 1;4-galactosyltransferase; polypeptide 2 1.573
116415 AA609204 Hs.27973 KIAA0874 protein 1.573
127879 AA810215 Hs.189079 ESTs 1.571
125211 W72798 Hs.103177 ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans] 1.571
114746 AA135638 Hs.223756 ESTs 1.571
122698 AA456112 Hs.99410 ESTs 1.57
116765 H12636 Hs.121585 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 1.568
130895 AA609828 Hs.21015 ESTs; Highly similar to tetracycline transporter-like protein [M.musculus] 1.568
114338 Z41366 Hs.40109 KIAA0872 protein 1.567
111005 N53076 Hs.5996 ESTs 1.567
128135 AA913491 Hs.189143 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.567
112046 R43365 Hs.22273 ESTs 1.566
132160 AA261770 Hs.184081 seven in absentia (Drosophila) homolog 1 1.566
111568 R10153 Hs£0561 ESTs 1.566
127775 H04106 Hs.179902 ESTs; Weakly similar to NG22 [H.sapiens] 1.566
115359 AA281936 Hs.88914 ESTs 1.566
121845 AA425734 Hs.165066 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.565
127854 AA769520 ESTs; Weakly similar to REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1 [H.sapiens] 1.564
120287 AA187679 Hs.111114 ESTs 1.563
114940 AA243012 Hs.75928 ESTs 1.562
126716 AA031700 Hs.251962 ESTs 1.562
134161 U97188 Hs.79440 IGF-II mRNA-binding protein 3 1.561
125390 H95094 Hs.75187 translocase of outer mitochondrial membrane 20 (yeast) homolog 1.561
115334 AA281244 Hs.65300 ESTs 1.559
113721 T97931 Hs.18190 EST 1.558
114895 AA236177 Hs.76591 KIAA0887 protein 1.558
119341 T62571 Hs.146388 microtubule-associated protein 7 1.558
108012 AA039616 Hs.61933 ESTs 1.558
130335 AA158499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.557
134351 R82074 Hs.82109 syndecan 1 1.557
133300 D51401 Hs.70333 ESTs 1.553
106920 AA490899 Hs.24462 ESTs 1.553
118744 N74075 Hs.94293 EST 1.552
126489 W20016 Hs.144228 ESTs; Weakly similar to ZINC FINGER PROTEIN 63 [H.sapiens] 1.55
115913 AA438720 Hs.55487 ESTs 1.55
107868 AA025234 Hs.61260 ESTs 1.55
134520 N21407 Hs.257325 ESTs 1.55
109703 F09684 Hs.24792 ESTs; Weakly similar to ORF YOR283W [S.cerevisiae] 1.55

120288 AA187938 Hs.35189 ESTs; Weakly similar to F25B5.3 [C.elegans] 1.548
106356 AA443277 Hs.31034 peroxisomal biogenesis factor 11A 1.548
129460 AA235627 Hs.11171 APG5 (autophagy 5; S.cerevisiae)-like 1.547
133950 011961 Hs.77823 ESTs 1.546
128172 AI400862 Hs.142607 ESTs 1.546
114162 Z38909 Hs.22265 ESTs 1.545
101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 1 1.544
113617 T93630 Hs.17207 ESTs 1.542
104896 AA054228 Hs.23165 ESTs 1.541
114477 AA032013 Hs.144260 EST 1.54
110731 H98653 Hs.188006 KIAA0878 protein 1.54
130367 Z38501 Hs.8768 ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.538
130539 L07044 Hs.250857 Homo sapiens calcium/calmodulin 1.538
134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1.537
130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein encoded in cosmid T20D3 [H.sapiens] 1.537
133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1.537
106450 AA449469 Hs.11859 ESTs 1.536
104120 AA429838 Hs.89519 KIAA1046 protein 1.536
100533 HG1879-HT1919 Ras-Like Protein Tc10 1.535
130664 R09049 Hs.17625 ESTs 1.535
127122 AA279153 Hs.190049 ESTs 1.535
134264 T03391 Hs.8087 ESTs 1.535
132319 AA418662 Hs.44625 ESTs 1.535
115465 AA286941 Hs.43691 ESTs 1.533
125003 T59442 Hs.100445 ESTs 1.532
102273 U30888 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1.532
121875 AA426299 Hs.98510 ESTs 1.532
114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.531
132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 1.53
111199 N68210 Hs.29822 ESTs 1.53
113494 T88878 Hs£58738 ESTs 1.529
129515 AA490882 Hs.112227 ESTs 1.528
133124 AA156049 Hs.65490 ESTs 1.528
104785 AA027163 Hs.7942 ESTs 1.526
105595 AA279408 Hs.25866 ESTs 1.526
130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1.526
114297 Z40758 Hs.173091 DKFZP434K151 protein 1.525
112876 T03488 Hs.4842 ESTs 1.525
127500 AA525014 Hs.162115 ESTs 1.525
120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1.525
119859 W80702 Hs.58461 ESTs 1.525
129944 L00389 Hs.1361 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1.524
118864 N89670 Hs.42148 ESTs; Weakly similar to Su(P) [D.melanogaster] 1.523
123964 C13961 Hs.210115 EST 1.523
111676 R19414 Hs.166459 ESTs 1.522
128332 AI079523 Hs.134173 ESTs 1.522
130455 X17059 Hs.155956 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1.521
125181 W58461 Hs.12396 ESTs 1.521
127093 AA768241 oa72d0£s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1317795 3', mRNA sequence. 1.521
132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1.521
125303 Z39821 Hs.107295 ESTs 1.52
132697 AA281951 Hs.3518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp566J2146) 1.52
117086 H93135 Hs.41840 ESTs 1.519
113355 T79203 Hs.14480 ESTs 1.518
108621 AA101811 Hs.69506 ESTs 1.518
109384 AA219172 Hs.86849 EST 1.518
128510 X94703 Hs.100816 RAB28; member RAS oncogene family 1.517
132968 N77151 Hs.61638 myosin X 1.515
117035 H88798 Hs.41182 ESTs 1.515
116781 H22985 Hs.32132 ESTs 1.513
108677 AA115629 Hs.118531 ESTs 1.513
130214 H78003 Hs.15266 ESTs 1.513
134700 AA481414 Hs.8868 golgi SNAP receptor complex member 1 1.512
116618 D80783 Hs.45224 ESTs 1.508
126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1.508
125859 AA806808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.508
113837 W57698 Hs.8888 ESTs 1.507

114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.507
100311 D50640 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1.507
126802 AA947601 Hs.97056 ESTs 1.506
128661 R82837 Hs.103329 KIAA0970 protein 1.506
134194 AA233231 Hs.79828 ESTs 1.506
108953 AA149652 Hs.42128 ESTs 1.504
133240 D31161 Hs.68613 ESTs 1.502
132671 X76302 Hs.54649 putative nucleic acid binding protein RY-1 1.501
132609 Z48923 Hs.53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1.501
105574 AA278678 Hs.258567 ESTs 1.5
113718 T97782 Hs.256268 ESTs 1.5
127824 AI208365 Hs.127811 ESTs 1.5
130132 U55936 Hs.184376 synaptosomal-associated protein; 23kD 1.5
127394 AA453224 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.5
100485 HG1111-HT1111 Ras-Like Protein Tc21 1.5
101078 L04510 Hs.792 ADP-ribosylation factor domain protein 1; 64kD 1.5

128611 AA456845 Hs.102471 KIAA0680 gene product 1.5

TABLE 12A shows the accession numbers for those primekeys lacking UnigeneIDs for
Table 12. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
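
As an illustration only, the relationship between the three columns defined above can be sketched in Python. The sketch assumes each table row has been restored to a single whitespace-separated line (Pkey, CAT number, then one or more accessions); the names `ClusterEntry`, `parse_table_12a_row` and `build_lookup` are hypothetical, and the two sample rows are transcribed on the assumption that the CAT suffix printed as "J" in the scanned table reads "_1".

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class ClusterEntry(NamedTuple):
    """One Table 12A row: a gene cluster and its member accessions."""
    cat_number: str
    accessions: List[str]


def parse_table_12a_row(row: str) -> Tuple[int, ClusterEntry]:
    """Split 'Pkey CAT accession...' into a (Pkey, ClusterEntry) pair."""
    fields = row.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {row!r}")
    return int(fields[0]), ClusterEntry(fields[1], fields[2:])


def build_lookup(rows: Iterable[str]) -> Dict[int, ClusterEntry]:
    """Map each probeset key to its cluster, skipping blank lines."""
    return dict(parse_table_12a_row(r) for r in rows if r.strip())


# Sample rows (assumed reading of the scanned table, see lead-in).
sample_rows = [
    "127854 443883_1 AW976796 AA769520",
    "127263 232161_1 AA331156 AA331157 AA331155",
]
lookup = build_lookup(sample_rows)
```

With such a lookup, a probeset that has no UnigeneID in Table 12 can still be traced to the Genbank sequences of its cluster, for example `lookup[127854].accessions`.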



Pkey CAT number Accession 



108536 119811_1
117040 46956_1
100782 18457_1

100819 3022_1

100824 5_36

125004 264197_1
102313 27608_1
102337 553_1



124704 



124825 
110455 
126257 
125S24 
104038 
103427 



292319_1

185904_1

330773_1

46874_1

182217_1

154135_1

264235_1

43892_1



AA084524 AA339253 AW966289 

AW970600 AA503323 H89218 AF086D31 H89 1 12 

AA355435 NM_001516 Z30093 T28405 AW949486 AA461142 AA410532 AI652073 AA521208 AI970141 AI968234 AI026102 
AA713583 AW135876 AA936614 AA770300 AI242635 AA377033 AW960263 AW607683 AI273803 AA41Q287 A1040513 
AA460838 AI80391 6 AW294095 AW449680 AW798677 AW675048 BE5421 16 AL120521 

L34840 NM_003241 U31905 A1546931 AI791616 AI973065 AI792321 AI546937 AI685880 AI732835 AI682360 AA420653 
AA564047 A1682323 AI824614 A1659889 AI680052 AI970887 AI623108 AA420692 A1418074 AA631018 AI810595 AW291463 
AW449930 AI668908 AI970818 

AI393237 A!521317 AI761348 AF025841 D43968 AW994987 L34598 AF025841 D89789 D89788 D89790 AW998932 
AI971742 AI310238 X90976 AW139668 AW674280 AI365552 AA877452 AV657554 C75229 AA376077 AI798056 AW609213 
W25586 H30149 BE075089 BE075190 AW580858 H99598 AA425238 AA133916 AW363478 BE158121 BE158127 
AW467960 BE158135 BE158126 BE158145 N92860 AA847246 AI961688 A1361423 AA878154 AA043767 AI863712 
A1559226 AW339007 A1371266 AI368901 AA046624 AA134739 AW449154 AA13Q232 AI458720 AA96251 1 AI700627 
R70437 AW004008 AA045229 A1671572 H99599 AA043768 AI685454 A1871685 N29937 X90977 AA524240 AI142114 
AI825750 AI567805 AI631365 AI347893 AA134740 F20669 AA046707 AW79321 6 AW963298 AW959380 AA363265 
AI784593 AI268201 R69451 AV657618 AI695588 

BE312163 AJ230798 AA374482 AI926059 AA622653 AI860704 BE139185 AW296884 T60238 T60120 
U33921 AI190489AA573311 

AI814663 AA806761 AA765241 AA019317 AA092255 AA035405 T85079 AA890151 AI373959 T85080 BE153728 AA740848 
BE080682 AL048137 AW182316 A1699468 AW274481 AW407538 AA306562 AW950024 AW949943 AL045703 AW843196 
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AA385181 AA164998 
A1246476 AA345406 AI277554 M134749 AA856624 BE613247 AA299003 AL048 138 AA028121 T92510 AJ923835 
AW020440 AI401594 AJ889401 N93290 AA044247 AA028100 AI582845AA811151 A1741811 AI925878 AA448277 AA172221 
AI214783 BE220793 AA022746 AI082882 AA022849 AI928385 AA573472 AI420686 AW0729Q2 AI799493 A1873506 
Ai468977AI192079AM68976AA044272 AW015701 AW316979AA933042 AA609017AI318393AI424571 AI934945 
AA172023 AW050917 AA846180 AA134748 AI003947 AI766769 AW006697 AAS53517 AW575680 A1474214 AA401478 
U36922 AA927064 AA868000 062654 T91745 AW500202 M1 94764 AA746346 AA1 30464 AW1 17498 AA054526 N26432 
H02534 H04964 AW303367 BE300931 AI21 8049 AI208073 AW182749 AA983630 A! 147585 AA194765 AA054534 AA922720 
A1436585 AI346535 AA134269 AA280923 AA897422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 AI216046 
AW496823AA019414H82288W35284 AI936621 AI767113AA866177AW367874H82398 AF032885 AW300151 AW467069 
AA809346 AI188507 AI494178 AA872752 AI631631 U02310 NM_002015 AA815006 A1382453 AW197658 AI761654 
AI8QA39S AI382221 AI813640AI439635AI523901 AW517242 AI221705 AW298104AW204560AW573095 AWQ28783 
AW014650 AI766744 AI808294 AI698758 AI041809 AI766667 AI479103 AA872797 AA769305 AA765080 AA334166 
A1472322 
R07335R07640 

AW953679 AW953680 AA244436 H82527 AA361046 AA244483 H82526 

AA501669R52088 

H52576 AF085971 H52172 

N99638 AW973750 AA328271 H90994 AA558020 AA234435 N59599 R94815 
AW968363 AA465492 R34539 AA16541 1 
AA374532 AA421255 

BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 
BE071965 AW239231 BE072000 BE071960 AW577360 AW749830 AW373020 X97303 AW999522 BE0001 92 BE562219 



104142 113242_1
127093 47721_1



AA074713AA447006 

AW977549 AA256038 AL365415 AW500455 AA768241 AW968097 217849 AA256104
125873 10492_1
125954 4457_1

125992 1589048_1
127210 15307_6

127263 232161_1
135197 29440_1



127394 
126879 
126983 
120470 
127854 
121367 
106320 



304844_1

1860_2

171841_1

188975_1

443883_1

280429_1

6435_1

115479 201515_1
101026 11075_1

100401 24827_1

130542 28089_3

100485 30576_2

108345 112277_6
100522 19669_1

100533 32905_1

100598 23902_2

102332 14745_3
118250 genbank_N62602
103678 entrez_Z84483
119400 genbank_T92767
119559 entrez_W38197



AW271 838 AL133605 C01646 H29959 AA999896 D60676 AW999454 AW961 176 AA315244 H14437 AW3861 18 N46512 
AW272Q21 AI76851 6 BE466421 A1082809 AI804454 AA905101 AW1 73368 N38942 AW614169 AI080483 N29489 AI500550 
AA994475 AA614464 AA707368 AA593145 AA569473 AW627815 AI828244 N63226 N42300 

NM.016353 AB023584 W44753 R09585 AA382665 R23772 AI814257 AA974046 AK001608 A1935638 AW440609 AI420022 
AA777386 AA806969 A1554676 A1584006 AI688556 A1688634 AI697997 AI014540 AI806683 AI741 202 AW263154 
AW297238 AI149951 AI589076 AW082158 AW614265 AA931887 M781969 R09490 AA484643 AI207121 AI088390 
A1538065 AI619547 AI741 925 AI702846 H40846 R93943 AW747979 AA461348 U30163 AA326023 AI535992 AW242870 
AI244025 AI222558 W38425 AW473630 A1624599 AI921226 AI683152 AI096458 AI123822 AW170802 C16447 AI337674 
D25726 AW339366 AW771259 AA461174 
H48372W01626 
AA305278 AA223833 

110924 6443J AW058463AF1 95766 AA6801 45 T86901 W60373W60281 NMJX)7222 AF1 06862 AI000795AA1 671 88 
AW884503 AW891313 AW891332 AW891312 AI984924 A1123518 N75170 AA131614 H25330 AI913358 A1742277 W25576 
R58771 AW445159 AW888628 AW888627 AW274674 AI088482 N52314 N34282 AW001769 AI338943 T66784 AI288963 
AW468676 AW237528 H25289 N71690 AA610128 AI143458 AI082599 N49144 AA854773 AW663411 AW610151 N47938 
AW601626 AA1 67189 AA918304 AA805205 BE069496 AA652836 BE069499 A1699298 AW249926 AW888578 BE567635 
T10726 AW604715 D54245 D53062 D55610 D55555 AA301376 AI133498 N77788 AI936320 AW090734 A1269977 N50828 
AA550814 AI421993 AI005384 N50813 D60292 D59349 AA131710D81698 D81699 
AA331 156 AA331 157 AA331 155 

U76456 NM_003256 AF057532 AA193414 AW293304 AW963378 AA313095 AI359841 AI969312 AI080163 AW448926 

AI671 136 BE466399 AI637967 AI671873 AW196583 AW071635 AI634427 AW296872 AW292470 AA193650 

BE161832 AA453224 AA485772 

D90391 M55575 AI652268 AA719776 

AA524886 AW971347 AA211537 

AW971327 AA524988 AW628653 AA251797 

AW976796AA769520 

AA432071 AA405648 AW000908 T16347 

AB028957AL120001 AI267678H1 0928 R1 9844 AW970334 AA393182 F05472 F1 1711 H09908 N50250 AI815411 BE463679 
D61468AW970253D60889C15548D61011 D60867 AI8 15795 AA534831 D81386 AW235039 A1382158 D81174 AA416899 
AAB52310 H09789 H10929 H09813 F09369 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
AI018713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608 T05327 T07118 AA339352 
AW301608 N46706 AA649093 AA287595 AW81 1753 AA287596 N39260 

NMJJ01874 J04970 T91426 AW205201 T84979 AA255727 AA847837 R02164 T91339 AV651884 AV651835 AV651350 
AV6501 18 AV651338 A1272002 A1367796 AA830651 AA2621 12 AW151 198 

AU076696 AA219720 AL135197 AA305877 N56376 AA318063 AA130725 AW954903 BE541230 AW383312 U86753 085423 
AI679458 AI122932 AB007892 A1583919 BE160134 F08104 R34903 F13440 AA095444 AA262453 AA191036 R17895 
T81266 BE149776 A1279537 A11431 13 AA361072 AW959030 AW268817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE000290 AA768053 F09494 BE092645 BE172099 Z41177 AA044750 AI909768 BE140795 BE140574 AW845210 
AW752452 BE243244 AA843664 AI300080 BE169032 AW189979 BE004869 AA621872 AI951772 AI678897 AI926598 
N62813 AI350912 AW608791 AI309602 AI983138 AW875592 AI655073 AW875626 AA130606 AI370827 C75528 C75554 
AW263335 AI344426 BE004788 AA576220 AA604824 AI431405 AA749378 R38882 AW955075 AA173821 C75657 
AA219672 AW768408 R43141 AI431414 AA483343 A1673792 T17294 AW770187 N74285 AI476404 AI088288 AA654152 
AW974864 BE617311 BE243328 BE168049 

U64675 AW1 67507 AW167508 BE218568 AA779360 W85722 AL044843 BE159404 AF012086 AW89861 1 AW898610 
BE159405 BE092191 AW890826 AW369841 AW368064 AW606702 AL044731 R82691 AA419346 AA416558 H96045 
AL040450 AI640531 AJ808434 AL04661 3 AW855784 AW362469 AL048881 AL049015 AA094272 AA8889Q8 M417294 
AW237786 R59793 AL044916 D82402 AI216854 AI079342 H96406 AL037845 AI915900 AA972133 A1478783 T31074 
221135 Z21396 AA352182 R13918 AA430178 C17811 A1371824 AI742256 AA926801 N79156 AA350610AA081971 N83639 
R35544 AA312292 AW952080 N42322 M171957 AA565297 R89207 AA504106 AI630782 AA826482 AI301579 T36241 
AW966618 Z28426 AL043480 AI124636 AA393449 T19504 AW887823 A1289814 N53979 AL043571 A1632764 AI859613 
AI986308 A1683212 AI984499 AI133258 C05898 AW512761 A1041260 BE466240 Z19161 AI351 190 N67549 A1373374 
AA400873 AW440914 AW514879 M770146 A1358754 R51 1 13 AI283773 AA649886 T30543 054358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 T85597 N62441 R89099 R00035 T85596 R61335 R00128 N63359 AI535964 
A1207768 M31468 NM_012250 W01322 AA253280 AA253233 AA293148 AW582106 R79880 AA459547 AA363459 
AA234396 N31669 H44468 AA434587 AW363088 AW993541 
AA070906AA070934 

X51501 NMJM2652 Y10179 J03460 A1791618 AI821473 AA916588 AA564296 AA9161 10 AI972286 AI420470 AI568790 
AI597724 AW205207 AI659305 AI791620 AA532383 AI821475 AA526498 

NM_012249 M31470 AL043108 AA262561 AA178883 T29433 AA313329 W48807 AW404323 AA453560 AW403227 H94816 
W17101 AA165152 W23989 AA091310 

AL121734 D54896 AA424269 BE242906 AA362118 BE018454 AI280348 AL048769 M35543 AA757734 AI128865 H20289 

H23728 AI203445 H41481 H18237 H44081 H92839 AI928621 H75675 D51148 AI796198 AW390453 D55579 D54145 D53996 

D54015 R37664 H17541 AA668681 T65061 R15867 AW468123 R16049 H69030 AA054226 H16070 F09655 R92144 T03521 

R05473 H92840 AA018186 R91707 

U35637 AA112989 Z19308 

N62602 

Z84483 

T92767 

W38197 
TABLE 13 shows genes, including expressed sequence tags, up-regulated in prostate tumor 
tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02 GeneChip 
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues. 
Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: Background subtracted normal prostate : prostate tumor tissue 
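
Because R1 is defined as the ratio of normal prostate to prostate tumor signal, a small R1 corresponds to higher expression in tumor, and its reciprocal gives the tumor : normal fold difference. The following is a minimal sketch of how a row of this table could be read and that reciprocal computed. The names `parse_row`, `Row`, and `tumor_over_normal` are hypothetical and introduced only for illustration, and the sketch assumes each row is a six-digit Pkey, optional annotation text, and a trailing numeric R1.

```python
import re
from typing import NamedTuple, Optional

# Assumed row layout (an illustration, not part of the source):
#   "<6-digit Pkey> [ExAccn] [UnigeneID] [Unigene Title ...] <R1>"
ROW = re.compile(r"^(\d{6})\s+(.*?)\s*(\d*\.\d+|\d+)$")


class Row(NamedTuple):
    pkey: int          # Eos probeset identifier
    annotation: str    # accession, UniGene ID and title, left unparsed
    r1: float          # normal prostate : prostate tumor ratio


def parse_row(line: str) -> Optional[Row]:
    """Return a Row, or None when the line does not look like a data row."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return Row(int(match.group(1)), match.group(2), float(match.group(3)))


def tumor_over_normal(r1: float) -> float:
    """Reciprocal of R1, i.e. the tumor : normal fold difference."""
    return 1.0 / r1
```

For example, a row with R1 = 0.05 would correspond to roughly a 20-fold higher signal in tumor than in normal tissue under this reading.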




Pkey ExAccn UnigeneID Unigene Title R1 


333516 CH22_FGENES.173_1 0.028 
337954 CH22_EM:AC005500.GENSCAN.96-3 0.029 
332496 R73299 Hs.204354 ras homolog gene family; member B 0.03 
337944 CH22_EM:AC005500.GENSCAN.89-7 0.033 
334111 CH22_FGENES.330J0 0.033 
333657 CH22_FGENES.241_2 0.034 
327718 CH.04_hs gi|6525284 0.034 
336355 CH22_FGENES.817J5 0.035 
322011 AL137354 EST cluster (not in UniGene) 0.035 
336377 CH22_FGENES.821_5 0.036 
300254 AW079607 Hs.188417 ESTs; Weakly similar to ZnT-3 [H.sapiens] 0.037 
330096 CH.19_p2 gi|6015278 0.037 
335191 CH22_FGENES.507_6 0.038 
334040 CH22_FGENES.322_8 0.039 
333586 CH22_FGENES.204_2 0.04 
333295 CH22_FGENES.132J2 0.042 
313326 AI088120 Hs.122329 ESTs 0.043 
329517 CH.10_p2 gi|3983513 0.043 
333403 CH22_FGENES.144_21 0.043 
335226 CH22_FGENES.513J1 0.044 
335976 CH22_FGENES.652_11 0.045 
333637 CH22_FGENES.229_2 0.046 
334582 CH22_FGENES.407_5 0.046 
336437 CH22_FGENES.826_4 0.047 
337461 CH22_FGENES.782-1 0.047 
302892 N58545 Hs.6975 histone deacetylase 3 0.049 
338689 CH22_EM:AC005500.GENSCAN.475-3 0.049 
334721 CH22_FGENES.421_32 0.049 
305867 AA864572 EST singleton (not in UniGene) with exon hit 0.049 
335498 CH22_FGENES.571_7 0.05 
311596 AI682088 Hs.223368 ESTs 0.05 
326959 CH.21_hs gi|6469836 0.051 
01 lOoo AWUZ3001 
317298 AI922374 Hs.158549 ESTs 0.052 
332984 CH22_FGENES.54_6 0.052 
321039 AW247083 EST cluster (not in UniGene) 0.053 
335844 CH22_FGENES.623_4 0.053 
325371 CH.12_hs gi|5866920 0.054 
335667 CH22_FGENES.590J8 0.054 
333635 CH22_FGENES.228_2 0.054 
336736 CH22_FGENES.110-2 0.055 
335893 CH22_FGENES.635J 0.055 
333170 CH22_FGENES.94_5 0.055 
329768 CH.14_p2 gi|6015501 0.055 
334030 CH22_FGENES.320_2 0.055 
323359 AA234172 Hs.137418 ESTs 0.055 
300453 AW051431 Hs.113029 ribosomal protein S25 0.055 
334262 CH22_FGENES.367J2 0.055 
306590 AI000246 EST singleton (not in UniGene) with exon hit 0.055 
331087 R22520 Hs.23398 ESTs 0.055 
338620 CH22_EM:AC005500.GENSCAN.450-18 0.056 
339045 CH22_DA59H18.GENSCAN.28-5 0.056 
308023 AI452732 EST singleton (not in UniGene) with exon hit 0.057 
339067 CH22_DA59H18.GENSCAN.33-3 0.057 
335689 CH22_FGENES.596_4 0.057 
339069 CH22_DA59H18.GENSCAN.33-5 0.057 
338176 CH22_EM:AC005500.GENSCAN^m 0.057 
328159 CH.06_hs gi|5868065 0.058 
335655 CH22_FGENES.590_6 0.058 
336371 CH22_FGENES.820_1 0.058 
336558 CH22_FGENES.842_3 0.059 
337738 CH22_EM:AC000097.GENSCAN.1004 0.059 
334273 CH22_FGENES.369_2 0.059 
335889 CH22_FGENES.633_3 0.059 
327807 CH.05_hs gi|5867968 0.059 
333315 CH22_FGENES.138_7 0.059 
338825 CH22_DJ246D7.GENSCAN.4-6 0.06 
337612 CH22_C20H12.GENSCAN.22-5 0.06 
333897 CH22_FGENES.293_4 0.06 
335990 CH22_FGENES.655_4 0.06 
334264 CH22_FGENES.367_15 0.06 
338653 CH22_EM:AC005500.GENSCAN.460-39 0.061 
322303 W07459 EST cluster (not in UniGene) 0.061 
333498 CH22_FGENES.168_8 0.061 
336522 CH22_FGENES.839_3 0.061 
301357 AW295677 Hs.137840 ESTs; Moderately similar to HOMEOBOX PROTEIN SIX1 [H.sapiens] 0.062 
305917 AA876469 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.062 
336143 CH22_FGENES.705_5 0.063 
333493 CH22_FGENES.168_2 0.063 
332533 M99487 Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 0.063 
325844 CH.16_hs gi|6552453 0.063 
336402 CH22_FGENES.823_17 0.063 
335767 CH22_FGENES.607_1 0.064 
301893 T80334 EST cluster (not in UniGene) with exon hit 0.064 
324019 AW177009 EST cluster (not in UniGene) 0.064 
305801 AA845997 EST singleton (not in UniGene) with exon hit 0.064 
335188 CH22_FGENES.507_3 0.065 
337533 CH22_FGENES.828-2 0.065 
333311 CH22_FGENES.138_3 0.065 
335668 CH22_FGENES.590J9 0.065 
306786 AI041589 EST singleton (not in UniGene) with exon hit 0.066 
306365 AA962086 EST singleton (not in UniGene) with exon hit 0.066 
306249 AA933840 EST singleton (not in UniGene) with exon hit 0.066 
335018 CH22_FGENES.474_6 0.066 
333594 CH22_FGENES.210_3 0.066 
333900 CH22_FGENES.293_7 0.066 
325207 CH.10_hs gi|6552430 0.067 
329888 CH.15_p2 gi|6067149 0.067 
326238 CH.17_hs gi|5867260 0.067 
333658 CH22_FGENES.241_4 0.067 
335809 CH22_FGENES.617_6 0.068 
307427 AI243437 EST singleton (not in UniGene) with exon hit 0.068 
318428 AI949409 Hs.224583 ESTs 0.069 
327005 CH.21_hs gi|5867664 0.069 
330463 HG998-HT998 Sulfotransferase, Phenol-Preferring 0.069 
333318 CH22_FGENES.138J0 0.07 
333313 CH22_FGENES.138_5 0.07 
325937 CH.16_hs gi|5867132 0.07 
335663 CH22_FGENES.590J4 0.07 
335349 CH22_FGENES.539_2 0.07 
303396 AA224470 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.07 
332603 N66681 Hs.33470 ESTs 0.07 
333310 CH22_FGENES.138_2 0.071 
309924 AW340812 EST singleton (not in UniGene) with exon hit 0.071 
336340 CH22_FGENES.814_15 0.071 
308025 AI453365 Hs.172928 collagen; type I; alpha 1 0.071 
306805 AI055968 EST singleton (not in UniGene) with exon hit 0.071 
335499 CH22_FGENES.571_8 0.071 
329669 CH.14_p2 gi|6272129 0.071 
321666 D28390 EST cluster (not in UniGene) 0.071 
338174 CH22_EM:AC005500.GENSCAN.219-2 0.072 
336556 CH22_FGENES.842J 0.072 
305451 AA738105 Hs.140 Immunoglobulin gamma 3 (Gm marker) 0.072 
336684 CH22_FGENES.46-1 0.072 
326943 CH.21_hs gi|6004446 0.073 
333947 CH22_FGENES.303J 0.074 
333214 CH22_FGENES.104_5 0.074 
331917 AA446572 Hs.174007 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING 0.074 
339102 CH22_DA59H18.GENSCAN.44-9 0.074 
328122 CH.06_hs gi|5868031 0.075 
332250 N62712 Hs.226223 KIAA0616 gene product 0.075 
328506 CH.07_hs gi|5868471 0.075 
331756 AA291468 Hs.98504 ESTs 0.075 
335193 CH22_FGENES.507_8 0.076 
317729 AA971718 Hs.128141 ESTs 0.076 
304515 AA458708 Hs.251577 hemoglobin; alpha 2 0.076 
313644 AI565766 Hs.124960 ESTs 0.076 
326145 CH.17_hs gi|5867204 0.076 
336394 CH22_FGENES.823_6 0.077 
306516 AA989542 EST singleton (not in UniGene) with exon hit 0.077 
300629 AA152119 Hs.155101 ATP synthase; H+ transporting; mitochondrial F1 complex; alpha subunit; isoform 1; cardiac muscle 0.077 
333160 CH22_FGENES.91_2 0.077 
337490 CH22_FGENES.799-5 0.077 
305403 AA723748 EST singleton (not in UniGene) with exon hit 0.077 
331747 AA281765 Hs.193689 ESTs 0.077 
332792 CH22_FGENES.3J2 0.078 
330513 M81057 Hs.180884 carboxypeptidase B1 (tissue) 0.078 
308905 AI859636 Hs.8102 ribosomal protein S20 0.078 
337419 CH22_FGENES75^4 0.078 
333459 CH22_FGENES.157_8 0.078 
334851 CH22_FGENES.440_3 0.078 
329046 CH.X_hs gi|5868569 0.078 
327879 CH.06_hs gi|5868142 0.079 
305830 AA857665 EST singleton (not in UniGene) with exon hit 0.079 
302928 AL137719 EST cluster (not in UniGene) with exon hit 0.079 
304321 AA136698 Hs.113029 ribosomal protein S25 0.079 
326390 CH.19_hs gi|5867340 0.079 
335230 CH22_FGENES.514_2 0.08 
334622 CH22_FGENES.412_6 0.08 
335331 CH22_FGENES.535_4 0.08 
304753 AA578840 Hs.77961 major histocompatibility complex; class I; B 0.08 
301863 AI418863 EST cluster (not in UniGene) with exon hit 0.081 
336561 CH22_FGENES.842_6 0.081 
335611 CH22_FGENES.583_5 0.081 
305060 AA635771 EST singleton (not in UniGene) with exon hit 0.081 
306051 AA905130 EST singleton (not in UniGene) with exon hit 0.082 
308289 AI571211 EST singleton (not in UniGene) with exon hit 0.082 
334365 CH22_FGENES.378_13 0.082 
335496 CH22_FGENES.571_4 0.082 
332634 S38953 Human unidentified gene complementary to P450c21 gene; partial cds 0.082 
337824 CH22_EM:AC005500.GENSCAN.13-18 0.082 
335822 CH22_FGENES.619_7 0.082 
334758 CH22_FGENES.428_7 0.082 
309641 AW194230 Hs.253100 EST 0.082 
333064 CH22_FGENES.75_7 0.083 
338695 CH22_EM:AC005500.GENSCAN.477-25 0.083 
331809 AA402482 Hs.97312 ESTs 0.083 
326138 CH.17_hs gi|5867203 0.083 
328304 CH.07_hs gi|6004478 0.083 
330570 U60276 Hs.165439 arsA (bacterial) arsenite 0.083 
334305 CH22_FGENES.373_8 0.083 
335885 CH22_FGENES.632_3 0.083 
325839 CH.16_hs gi|6552452 0.083 
333531 CH22_FGENES.175J8 0.084 
330385 AA449749 Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens] 0.084 
323305 AA811351 Hs.25307 Homo sapiens clone 24812 mRNA sequence 0.084 
331698 Z39929 Hs.65843 ESTs 0.084 
335888 CH22_FGENES.633J2 0.084 
306008 AA894390 EST singleton (not in UniGene) with exon hit 0.084 
334249 CH22_FGENES.365J5 0.084 
318303 AW451197 Hs.113418 ESTs 0.084 
330171 CH.02_p2 gi|6648220 0.084 
336662 CH22_FGENES.41-1 0.085 
320506 AI815668 Hs.157476 suc1-associated neurotrophic factor target 2 (FGFR signalling adaptor) 0.085 
316974 AI740721 Hs.128292 ESTs 0.085 
336492 CH22_FGENES.832_9 0.085 
335750 CH22_FGENES.602_4 0.085 
335676 CH22_FGENES.594J 0.086 
336093 CH22_FGENES.691_2 0.086 
310932 AI933861 Hs.222852 ESTs 0.086 
335160 CH22_FGENES^02_4 0.086 
334306 CH22_FGENES.373_9 0.086 
334793 CH22_FGENES.433_5 0.086 
333936 CH22_FGENES.301_2 0.087 
336413 CH22_FGENES.823_35 0.087 
333775 CH22_FGENES.272J 0.087 
335971 CH22_FGENES.652_4 0.087 
301737 AI815981 EST cluster (not in UniGene) with exon hit 0.087 
339101 CH22_DA59H18.GENSCAN.44-6 0.087 
327612 CH.04_hs gi|6525283 0.087 
326241 CH.17_hs gi|5867260 0.088 
338386 CH22_EM:AC005500.GENSCAN.3314 0.088 
327762 CH.05_hs gi|5867961 0.088 
305266 AA679772 EST singleton (not in UniGene) with exon hit 0.088 
334359 CH22_FGENES.378_4 0.088 
335500 CH22_FGENES£71J0 0.088 
329687 CH.14_p2 #117856 0.088 
333654 CH22_FGENES.240_2 0.088 
324430 AA464018 EST cluster (not in UniGene) 0.088 
325999 CH.16_hs gi|5867073 0.089 
334832 CH22_FGENES.439J 0.089 
339115 CH22_DA59H18.GENSCAN.49-3 0.089 
300896 AI916902 Hs.213882 ESTs 0.089 
328784 CH.07_hs gi|5868309 0.089 
335044 CH22_FGENES.480J 0.089 
329791 CH.14_p2 gi|6469354 0.089 
333656 CH22_FGENES.240_4 0.089 
326180 CH.17_hs gi|5867211 0.089 
333391 CH22_FGENES.144J> 0.089 
338324 CH22_EM:AC005500.GENSCAN.306-3 0.089 
305396 AA721052 EST singleton (not in UniGene) with exon hit 0.089 
337483 CH22_FGENES.795-7 0.09 
326424 CH.19_hs gi|5867369 0.09 
306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09 
338893 CH22_DJ32I10.GENSCAN.7-6 0.09 
327470 CH.02_hs gi|5867772 0.09 
333165 CH22_FGENES.91_7 0.09 
307155 AI186738 Hs.182426 ribosomal protein S2 0.09 
330717 AA233926 Hs.23635 ESTs 0.09 
335334 CH22_FGENES.535_10 0.09 
335907 CH22_FGENES.636_2 0.09 
333885 CH22_FGENES.292_7 0.09 
331034 N51868 Hs.31965 ESTs; Moderately similar to 40S RIBOSOMAL PROTEIN S20 [H.sapiens] 0.09 
304660 AA534416 Hs.162185 ESTs 0.09 
328217 CH.06_hs gi|5868096 0.091 
336068 CH22_FGENES.684J3 0.091 
302833 AA295381 Hs.44423 ESTs 0.091 
328668 CH.07_hs gi|5868254 0.091 
335309 CH22_FGENES.532_2 0.091 
338481 CH22_EM:AC005500.GENSCAN.377-5 0.091 
306286 AA936892 EST singleton (not in UniGene) with exon hit 0.091 
305070 AA639783 EST singleton (not in UniGene) with exon hit 0.091 
304870 AA594811 Hs.119122 ribosomal protein L13a 0.091 
303856 AA968589 Hs.944 glucose phosphate isomerase 0.091 
323789 AI459812 Hs.170460 ESTs; Weakly similar to KIAA0990 protein [H.sapiens] 0.092 
334910 CH22_FGENES.455_3 0.092 
326382 CH.19_hs gi|5867327 0.092 
332467 AA489630 Hs.119004 KIAA0665 gene product 0.092 
338534 CH22_EM:AC005500.GENSCAN.402-7 0.092 
336449 CH22_FGENES.829_6 0.092 
333709 CH22_FGENES.250J4 0.092 
336559 CH22_FGENES.842_4 0.092 
333230 CH22_FGENES.107J0 0.093 
333133 CH22_FGENES.83_9 0.093 
334885 CH22_FGENES.451_11 0.093 
330605 X02419 Hs.77274 plasminogen activator; urokinase 0.093 
336392 CH22_FGENES.823_4 0.093 
334083 CH22_FGENES.327_38 0.093 
325469 CH.12_hs gi|6017034 0.093 
331077 R09531 Hs.19039 ESTs 0.093 
303701 AW500732 EST cluster (not in UniGene) with exon hit 0.093 
334218 CH22_FGENES.358_3 0.093 
336542 CH22_FGENES.840_6 0.093 
337151 CH22_FGENES.546-1 0.093 
333642 CH22_FGENES.231_2 0.093 
336863 CH22_FGENES.297-4 0.093 
334680 CH22_FGENES.419_2 0.093 
326365 CH.18_hs gi|5867297 0.093 
338952 CH22_DJ32I10.GENSCAN.23-22 0.093 
337539 CH22_FGENES.832-4 0.094 
333546 CH22_FGENES.180_2 0.094 
335258 CH22_FGENES.518_3 0.094 
336786 CH22_FGENES.168-19 0.094 
321644 AI204177 Hs.237396 ESTs 0.094 
335943 CH22_FGENES.646J7 0.094 
327918 CH.06_hs gi|5868165 0.094 
306398 AA970548 EST singleton (not in UniGene) with exon hit 0.094 
335671 CH22_FGENES.592_3 0.094 
335033 CH22_FGENES.475_11 0.094 
338277 CH22_EM:AC005500.GENSCAN^90-2 0.094 
332061 AA504812 Hs.192824 early B-cell factor 0.094 
305153 AA654582 Hs.77039 ribosomal protein S3A 0.094 
333880 CH22_FGENES.292_2 0.094 
323940 AI864428 Hs.170880 ESTs 0.094 
313779 AA648796 Hs.129771 ESTs 0.095 
323109 AA169345 EST cluster (not in UniGene) 0.095 
332930 CH22_FGENES.38jt 0.095 
335368 CH22_FGENES.543_6 0.095 
303887 R72672 Hs.193484 ESTs; Weakly similar to Similarity with yeast gene L3502.1 [C.elegans] 0.095 
336223 CH22_FGENES.727_3 0.095 
311280 AI767957 Hs.197737 ESTs; Weakly similar to Y38A8.1 gene product [C.elegans] 0.095 
337258 CH22_FGENES.648-3 0.095 
308814 AI819263 EST singleton (not in UniGene) with exon hit 0.095 
334659 CH22_FGENES.418_7 0.095 
335895 CH22_FGENES.635_3 0.095 
321697 AW388061 Hs.4953 golgi autoantigen; golgin subfamily a; 3 0.095 
336010 CH22_FGENES.668_8 0.096 
302824 U21260 EST cluster (not in UniGene) with exon hit 0.096 
333612 CH22_FGENES.217_7 0.096 
304823 AA584837 EST singleton (not in UniGene) with exon hit 0.096 
335665 CH22_FGENES.590_16 0.096 
306518 AA989598 EST singleton (not in UniGene) with exon hit 0.096 
335243 CH22_FGENES.516_4 0.096 
335436 CH22_FGENES.559_5 0.096 
300243 AI420256 Hs.161271 ESTs 0.096 
332810 CH22_FGENES.7J2 0.097 
308612 AI735634 EST singleton (not in UniGene) with exon hit 0.097 
335818 CH22_FGENES.618_6 0.097 
325838 CH.16_hs gi|6552452 0.097 
337482 CH22_FGENES.79*6 0.097 
336645 CH22_FGENES.26-1 0.097 
337293 CH22_FGENES.675-1 0.098 
329893 CH.15_p2 gi|6525313 0.098 
326533 CH.19_hs gi|5867441 0.098 
334905 CH22_FGENES.452_20 0.098 
306347 AA961144 EST singleton (not in UniGene) with exon hit 0.098 
336676 CH22_FGENES.43-4 0.098 
339166 CH22_DA59H18.GENSCAN.69-7 0.098 
335774 CH22_FGENES.607_10 0.098 
339216 CH22_FF113D11.GENSCAN.6-11 0.098 
335311 CH22_FGENES.532_4 0.098 
329632 CH.11_p2 gi|6729060 0.098 
328595 CH.07_hs gi|5868224 0.098 
326928 CH.21_hs gi|6456782 0.098 
315234 AI079680 Hs.120770 ESTs 0.098 
306082 AA908508 EST singleton (not in UniGene) with exon hit 0.098 
305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098 
318540 T30280 EST cluster (not in UniGene) 0.099 
337553 CH22_C4G1.GENSCAN.2-1 0.099 
320951 AA344069 Hs.202699 neurexophilin 4 0.099 
303845 T08033 EST cluster (not in UniGene) with exon hit 0.099 
338981 CH22_DA59H18.GENSCAN.2-5 0.099 
321313 R87365 Hs.26058 ESTs; Weakly similar to rj532 [H.sapiens] 0.099 
328348 CH.07_hs gi|5868383 0.099 
332203 H49388 Hs.102082 EST 0.099 
301780 R07064 EST cluster (not in UniGene) with exon hit 0.099 
332095 AA608838 Hs.162681 EST 0.099 
333227 CH22_FGENES.107_5 0.099 
316442 AA760894 Hs.153023 ESTs 0.099 
326001 CH.16_hs gi|5867073 0.099 
334363 CH22_FGENES.378J1 0.099 
338895 CH22_DJ32I10.GENSCAN.9-2 0.099 
327460 CH.02_hs gi|6004455 0.099 
332705 T59161 Hs.76293 thymosin; beta 10 0.1 
307806 AI351739 EST singleton (not in UniGene) with exon hit 0.1 
322800 F25037 Hs.225175 ESTs 0.1 
304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1 
334327 CH22_FGENES.375_4 0.1 
318359 AI097439 Hs.135548 ESTs 0.1 
326644 CH.20_hs gi|5867559 0.1 
334454 CH22_FGENES.388_3 0.1 
327959 CH.06_hs gi|5868210 0.1 
323783 AA330586 Hs.131819 ESTs 0.1 
309198 AI955915 Hs.248038 major histocompatibility complex; class I; C 0.1 
339265 CH22_BA354I12.GENSCAN.10-3 0.1 
320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (from clone DKFZp564C122) 0.1 
338132 CH22_EM:AC005500.GENSCAN.200-2 0.1 
333163 CH22_FGENES.91_5 0.101 
337584 CH22_C20H12.GENSCAN.5-1 0.101 
307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101 
336969 CH22_FGENES.378-2 0.101 
327535 CH.02_hs gi|6525279 0.101 
328732 CH.07_hs gi|5868289 0.101 
336686 CH22_FGENES.46-3 0.101 
335777 CH22_FGENES.607_13 0.101 
332944 CH22_FGENES.47_3 0.101 
333174 CH22_FGENES.95_1 0.101 
336380 CH22_FGENES.821_8 0.101 
330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig); cytoplasmic domain; (semaphorin) 4D 0.101 
331789 AA398721 Hs.186749 ESTs 0.101 
338915 CH22_DJ32I10.GENSCAN.12-1 0.101 
334844 CH22_FGENES.439_24 0.101 
336642 CH22_FGENES.23-4 0.101 
334906 CH22_FGENES.452_21 0.101 
333188 CH22_FGENES.98_8 0.101 
300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101 
329373 CH.X_hs gi|6682537 0.102 
331120 R46576 Hs.23239 ESTs 0.102 
335856 CH22_FGENES.628J 0.102 
331888 AA431337 Hs.98017 ESTs 0.102 
333154 CH22_FGENES.89_4 0.102 
335989 CH22_FGENES.655_2 0.102 
304385 AA235602 EST singleton (not in UniGene) with exon hit 0.102 
338016 CH22_EM:AC005500.GENSCAN.133-1 0.102 
335190 CH22_FGENES.507_5 0.102 
318595 T39486 Hs.6137 ESTs 0.102 
333697 CH22_FGENES.250J1 0.102 
306526 AA989713 EST singleton (not in UniGene) with exon hit 0.103 
328734 CH.07_hs gi|5868289 0.103 
307294 AI205612 Hs.73742 ribosomal protein; large; P0 0.103 
327424 CH.02_hs gi|5867751 0.103 
335872 CH22_FGENES.630_3 0.103 
333572 CH22_FGENES.189J 0.103 
334774 CH22_FGENES.430_6 0.103 
338660 CH22_EM:AC005500.GENSCAN.462-1 0.103 
326713 CH.20_hs gi|5867595 0.103 
333994 CH22_FGENES.310J8 0.103 
335800 CH22_FGENES.613_4 0.103 
318113 AI187943 Hs.132322 ESTs 0.103 
337278 CH22_FGENES.665-1 0.103 
336386 CH22_FGENES.822_6 0.103 
334790 CH22_FGENES.432J5 0.103 
303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104 
336524 CH22_FGENES.839_5 0.104 
328936 CH.08_hs gi|5868500 0.104 
335102 CH22_FGENES.494_7 0.104 
300935 AA513644 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome protein [H.sapiens] 0.104 
307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104 
317301 AW291683 Hs.226056 ESTs 0.104 
335330 CH22_FGENES.535_3 0.104 
337968 CH22_EM:AC005500.GENSCAN.103-2 0.104 
335627 CH22_FGENES.584_7 0.104 
336274 CH22_FGENES.762_2 0.104 
334730 CH22_FGENES.424_5 0.105 
334409 CH22_FGENES.383_6 0.105 
327237 CH.01_hs gi|5867544 0.105 
333321 CH22_FGENES.138_13 0.105 
303181 AA452366 EST cluster (not in UniGene) with exon hit 0.105 
333738 CH22_FGENES.261_2 0.105 
338255 CH22_EM:AC005500.GENSCAN.276-3 0.105 
334282 CH22_FGENES.369J2 0.105 
330190 CH.05_p2 gi|6165182 0.105 
310748 AW014249 Hs.158698 ESTs 0.105 
338150 CH22_EM:AC005500.GENSCAN.207-2 0.105 
336719 CH22_FGENES.82-6 0.105 
330228 CH.05_p2 gi|6013527 0.105 
327801 CH.05_hs gi|5867924 0.105 
330525 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.105 
334972 CH22_FGENES.468_2 0.105 
335111 CH22_FGENES.494_19 0.106 
334483 CH22_FGENES.395_5 0.106 
328829 CH.07_hs gi|5868337 0.106 
302753 M74299 EST cluster (not in UniGene) with exon hit 0.106 
334512 CH22_FGENES.398J0 0.106 
330024 CH.16_p2 gi|6671908 0.106 
321030 AI769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's disease candidate region 0.107 
338410 CH22_EM:AC005500.GENSCAN^4H 0.107 
334353 CH22_FGENES.376_5 0.107 
338276 CH22_EM:AC005500.GENSCAN.288-9 0.107 
329053 CH.X_hs gi|5868574 0.107 
336560 CH22_FGENES.842_5 0.107 
332158 AA621363 Hs.112980 EST 0.107 
336447 CH22_FGENES.829_4 0.107 
333703 CH22_FGENES.250J7 0.107 
326207 CH.17_hs gi|5867222 0.107 
333232 CH22_FGENES.108J 0.107 
334802 CH22_FGENES.435_1 0.107 
303784 AA704983 EST cluster (not in UniGene) with exon hit 0.107 
338847 CH22_DJ246D7.GENSCAN.10-2 0.107 
339407 CH22_DJ579N16.GENSCAN.1-9 0.108 
337635 CH22_C20H12.GENSCAN.324 0.108 
334650 CH22_FGENES.417_17 0.108 
308511 AI687580 EST singleton (not in UniGene) with exon hit 0.108 
333392 CH22_FGENES.144_8 0.108 
325840 CH.16_hs gi|6552452 0.108 
315044 AW205664 Hs.129568 ESTs 0.108 
333298 CH22_FGENES.133_4 0.108 
335157 CH22_FGENES.501_7 0.108 
333305 CH22_FGENES.137_2 0.108 
326379 CH.19_hs gi|5867327 0.108 
335050 CH22_FGENES.482_1 0.108 
305185 AA663985 Hs.248038 major histocompatibility complex; class I; C 0.108 
335658 CH22_FGENES.590_9 0.108 
323040 AA336609 Hs.10862 ESTs 0.108 
337326 CH22_FGENES.699-6 0.108 
339262 CH22_BA354I12.GENSCAN.9-6 0.108 
321202 H54052 Hs.163639 ESTs; Weakly similar to INTERCELLULAR ADHESION MOLECULE-1 PRECURSOR [H.sapiens] 0.109 
331792 AA398968 Hs.97548 EST 0.109 
333806 CH22_FGENES.278J2 0.109 
321325 AB033100 EST cluster (not in UniGene) 0.109 
331373 AA435513 Hs.178170 ESTs; Weakly similar to DUAL SPECIFICITY PROTEIN PHOSPHATASE 3 0.87 
326775 CH.O7_hsgil50683O9 0.109 
335105 CH22_FGENES.494_10 0.109 
300975 AI283548 Hs.149668 ESTs 0.109 
324893 T31940 EST cluster (not in UniGene) 0.109 
333397 CH22_FGENES.144J5 0.109 
336484 CH22_FGENES.831_3 0.109 
335507 CH22_FGENES.571_22 0.109 
336373 CH22_FGENES.820_3 0.109 
336188 CH22_FGENES.717_12 0.109 
313455 AW081702 Hs.137329 ESTs 0.109 
335185 CH22_FGENES.506_4 0.109 
306814 AI066577 EST singleton (not in UniGene) with exon hit 0.109 
311130 AI632322 Hs.195306 ESTs 0.109 
310882 AW080339 Hs.211911 ESTs 0.109 
323383 AI346359 Hs.135209 ESTs 0.11 
300212 AW135925 Hs.184552 biphenyl hydrolase-like (serine hydrolase; breast epithelial mucin-assoc, 0.11 
325675 CH.14_hs gi|5867014 0.11 
330095 CH.19_p2 gi|6015278 0.11 
331942 AA453261 Hs.99309 ESTs 0.11 
334723 CH22_FGENES.421_34 0.11 
333614 CH22_FGENES.217_9 0.11 
337316 CH22_FGENES.692-1 0.11 
305057 AA635626 Hs.62954 ferritin; heavy polypeptide 1 0.11 
338704 CH22_EM:AC005500.GENSCAN.480-3 0.11 
335385 CH22_FGENES.543_27 0.11 
338012 CH22_EM:AC005500.GENSCAN.128-10 0.11 
329449 CH.Y_hs gi|5868886 0.11 
338980 CH22_DA59H18.GENSCAN.2-4 0.11 
336553 CH22_FGENES.841_10 0.111 
330021 CH.16_p2 gi|6671889 0.111 
327579 CH.03_hs gi|5867824 0.111 
333099 CH22_FGENES.79_4 0.111 
337076 CH22_FGENES.453-4 0.111 
331388 AA456652 Hs.43543 suppressor of white apricot homolog 2 0.111 
306674 AI005542 Hs.160414 heat shock 70kD protein 10 (HSC71) 0.111 
305949 AA884409 EST singleton (not in UniGene) with exon hit 0.111 
330748 AM19217 Hs.15911 DKFZP586E1422 protein 0.111 
333780 CH22_FGENES.273_2 0.111 
323676 AI702835 EST cluster (not in UniGene) 0.111 
308952 AI868157 Hs.224226 EST 0.111 
309338 AW026946 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.111 
329317 CH.X_hs gi|6381976 0.112 
333518 CH22_FGENES.173_3 0.112 
306982 AI127883 EST singleton (not in UniGene) with exon hit 0.112 
336225 CH22_FGENES.728_2 0.112 
333698 CH22_FGENES.250J2 0.112 
302173 AI417947 Hs.14068 ESTs 0.112 
335510 CH22_FGENES.571_25 0.112 
328042 CH.06_hs gi|5902482 0.112 
336512 CH22_FGENES.834JT 0.112 
328541 CH.07_hs gi|5868486 0.112 
311265 AW205118 Hs.199214 ESTs 0.112 
323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 0.112 
302002 AF013956 Hs.123085 chromobox homolog 4 (Drosophila Pc class) 0.112 
315088 AA557351 Hs.152448 ESTs; Moderately similar to MULTIFUNCTIONAL PROTEIN ADE2 0.112 
312581 AI937242 Hs.176590 ESTs 0.112 
322246 AW384710 Hs.125258 ESTs 0.112 
333659 CH22_FGENES.241J 0.113 
327510 CH.02_hs gi|6117815 0.113 
336520 CH22_FGENES.839_1 0.113 
338682 CH22_EM:AC005500.GENSCAN.472-1 0.113 
334508 CH22_FGENES.398J 0.113 
322533 T59538 EST cluster (not in UniGene) 0.113 
306873 AI086929 EST singleton (not in UniGene) with exon hit 0.113 
336040 CH22_FGENES.679_2 0.113 
303898 T23215 EST cluster (not in UniGene) with exon hit 0.113 
312011 AW294868 Hs.187226 ESTs 0.113 
335186 CH22_FGENES.506_5 0.113 
333607 CH22_FGENES.216J 0.113 
305549 AA773530 EST singleton (not in UniGene) with exon hit 0.113 
333686 CH22_FGENES.249J 0.113 
334352 CH22_FGENES.376J 0.113 
338195 CH22_EM:AC005500.GENSCAN.233-18 0.114 
333588 CH22_FGENES.206_2 0.114 
339233 CH22_BA354I12.GENSCAN.2-3 0.114 
337455 CH22_FGENES.777-1 0.114 
309101 AI925108 EST singleton (not in UniGene) with exon hit 0.114 
328522 CH.07_hs gi|5868477 0.114 
323999 AI537333 Hs.252782 ESTs 0.114 
333517 CH22_FGENES.173_2 0.114 
329935 CH.16_p2 gi|6165200 0.114 
326226 CH.17_hs gi|5867230 0.114 
335890 CH22_FGENES.633jt 0.114 
336715 CH22_FGENES.77-1 0.114 
327640 CH.04_hs gi|5867890 0.114 
338842 CH22_DJ246D7.GENSCAN.7-1 0.114 
306534 AA991487 EST singleton (not in UniGene) with exon hit 0.114 
336597 CH22_FGENES.266J 0.114 
321010 Y17456 Hs.227150 Homo sapiens LSFR2 gene; last exon 0.114 
302294 AA159213 Hs.5337 isocitrate dehydrogenase 2 (NADP+); mitochondrial 0.114 
324895 N44238 Hs.77515 inositol 1;4;5-triphosphate receptor; type 3 0.114 
327358 CH.01_hs gi|6552411 0.114 
308792 AI815153 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.115 
325886 CH.16_hs gi|5867087 0.115 
336850 CH22_FGENES.272-11 0.115 
305858 AA863103 EST singleton (not in UniGene) with exon hit 0.115 
302569 AC004472 multiple UniGene matches 0.115 
336158 CH22_FGENES.707J 0.115 
327866 CH.06_hs gi|5868131 0.115 
339157 CH22_DA59H18.GENSCAN.67-3 0.115 
339258 CH22_BA354I12.GENSCAN.8-3 0.115 
336129 CH22_FGENES.701J7 0.115 
333684 CH22_FGENES.249J 0.115 
309618 AW190162 Hs.184776 ribosomal protein L23a 0.115 
312926 AA954097 Hs.127523 ESTs 0.115 
302640 AB035698 EST cluster (not in UniGene) with exon hit 0.115 
328968 CH.08_hs gi|6456775 0.115 
327902 CH.06_hs gi|5868158 0.115 
321927 AJ223366 EST cluster (not in UniGene) 0.115 
335962 CH22_FGENES.651_4 0.115 
334927 CH22_FGENES.460J 0.115 
330535 U11872 Human interleukin-8 receptor type B (IL8RB) mRNA, splice variant IL8RB1 0.856 
328591 CH.07_hs gi|5868227 0.115 
334902 CH22_FGENES.452J6 0.115 
328525 CH.07_hs gi|5868482 0.115 
325870 CH.16_hs gi|6682492 0.116 
337522 CH22_FGENES.819-1 0.116 
305079 AA641329 EST singleton (not in UniGene) with exon hit 0.116 
327343 CH.01_hs gi|6017017 0.116 
333918 CH22_FGENES.296_7 0.116 
333600 CH22_FGENES.213J 0.116 
335846 CH22_FGENES.623_6 0.116 
333510 CH22_FGENES.171_4 0.116 
327629 CH.04_hs gi|5867872 0.116 
333470 CH22_FGENES.161_6 0.116 
326855 CH.20_hs gi|6552460 0.116 
327008 CH.21_hs gi|5867664 0.117 
337480 CH22_FGENES.795-3 0.117 
336425 CH22_FGENES.824J0 0.117 
321964 AL079687 Hs.171065 ESTs 0.117 
335651 CH22_FGENES.590_2 0.117 
308164 AI521574 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.117 
337927 CH22_EM:AC005500.GENSCAN.8(K3 0.117 
300341 H45095 Hs.153524 ESTs 0.117 
300154 AI245127 Hs.179331 ESTs 0.117 
306295 AA937331 EST singleton (not in UniGene) with exon hit 0.117 
329670 CH.14_p2 gi|6272129 0.117 
335612 CH22_FGENES.583_6 0.117 
307845 AI363450 EST singleton (not in UniGene) with exon hit 0.117 
330401 D28383 Human mRNA for ATP synthase B chain, 5'UTR (sequence from the 5'cap to the start codon) 0.117 
327127 CH.21_hs gi|6682520 0.117 
333843 CH22_FGENES.290J 0.117 
331083 R17762 Hs.22292 ESTs 0.117 
329140 CH.X_hs gi|6017060 0.117 
339338 CH22_BA354I12.GENSCAN.27-3 0.117 
331974 AA464518 Hs.99616 ESTs 0.117 
338631 CH22_EM:AC005500.GENSCAN.454-2 0.117 
330299 CH.06_p2 gi|2905881 0.117 
330351 CH.09_p2 gi|3056622 0.117 
305377 AA715714 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.117 
333106 CH22_FGENES.79_12 0.117 
338514 CH22_EM:AC005500.GENSCAN592-4 0.117 
327335 CH.01_hs gi|5902477 0.117 
301970 AB028962 Hs.120245 KIAA1039 protein 0.118 
326339 CH.17_hs gi|6056311 0.118 
330612 X15673 Hs.93174 Human endogenous retrovirus pHE.1 (ERV9) 0.118 
334178 CH22_FGENES.350j6 0.118 
328008 CH.06_hs gi|5902482 0.118 
329976 CH.16_p2 gi|4878063 0.118 
320952 AA897432 Hs.130411 ESTs 0.118 
305621 AA789095 EST singleton (not in UniGene) with exon hit 0.118 
337850 CH22_EM:AC005500.GENSCAN.34-3 0.118 
333626 CH22_FGENES.224_2 0.118 
337672 CH22_EM:AC000097.GENSCAN.67-1 0.118 
328803 CH.07_hs gi|6004475 0.118 
325922 CH.16_hs gi|5867122 0.118 
334489 CH22_FGENES.397J 0.118 
320638 R54766 Hs.101120 ESTs 0.118 
321932 AA569229 EST cluster (not in UniGene) 0.118 
336958 CH22_FGENES.367-1 0.118 
332082 AA600176 Hs.112345 ESTs 0.118 
306004 AA889992 EST singleton (not in UniGene) with exon hit 0.118 
336803 CH22_FGENES.194-1 0.118 
309107 AI925823 EST singleton (not in UniGene) with exon hit 0.118 
336859 CH22_FGENES.293-9 0.118 
337935 CH22_EM:AC005500.GENSCAN.85-6 0.118 
326492 CH.19_hs gi|5867422 0.118 
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327289 CH.01Jisgi|5867481 0.119 

325818 CH.14_hsgi|6682490 0.119 

310787 AW262580 Hs.159040 ESTs 0.119 

330028 CH.16_p2 gi)6671908 0.1 19 

5 325317 CH.11Jisgi|5866878 0.119 

335279 CH22_FGENES.523__7 0.119 

331720 AA192173 Hs321530 ESTs 0.119 

329186 CH.X_hsgi|5868711 0.119 

316012 AA764950 Hs.1 19898 ESTs 0.119 

10 338316 CH22_EM:AC005500.GENSCAN.304-2 0.119 

326033 CH.17_hs gi|5867178 0.1 19 

334745 CH22_FGENES.426_3 0.119 

333051 CH22_FGENES.73_5 0.119 

301763 R01279 EST duster (not in UniGene) with exon hit 0.12 

15 304502 AA454809 Hs.1 72928 collagen; type 1; alpha 1 0.12 

335680 CH22_FGENES.594_5 0.12 

304678 AA548556 EST singleton (not in UniGene) with exon hit 0.12 

335441 CH22J=GENES.560_4 0.12 

336187 CH22J=GENES.717J1 0.12 

20 309422 AW087175 EST singleton (not in UniGene) with exon hit 0.12 

336047 CH22_FGENES.679_9 0.12

309651 AW195850 EST singleton (not in UniGene) with exon hit 0.12

308547 AI695385 Hs.201903 EST 0.12

304443 AA399444 EST singleton (not in UniGene) with exon hit 0.12 

336245 CH22_FGENES.746_3 0.12

302703 H72333 EST cluster (not In UniGene) with exon hit 0.12 

335690 CH22_FGENES.596_5 0.12 

328941 CH.08_hsgi|6456765 0.12 

333873 CH22_FGENES.291_9 0.12 

317246 AW105092 Hs.155690 ESTs 0.12

339288 CH22_BA354I12.GENSCAN.16-6 0.12

337996 CH22_EM:AC005500.GENSCAN.116-3 0.12

333304 CH22.FGENES.137J 0.121 

308332 AI591235 EST singleton (not In UniGene) with exon hit 0.121 

329319 CH.X_hs gi|6381976 0.121

302086 X57138 multiple UniGene matches 0.121 

333290 CH22_FGENES.129_2 0.121 

323825 AI793080 Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED LIPOCALIN PRECURSOR [R.norvegicus] 0.121

330575 U64105 Hs.252280 Rho guanine nucleotide exchange factor (GEF) 1 0.121

305274 AA679990 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.121

333647 CH22_FGENES.235_2 0.121 

302251 AA333340 EST cluster (not in UniGene) with exon hit 0.121 

329777 CH.14_p2 gi|6002090 0.121

333155 CH22_FGENES.89_5 0.121

326122 CH.17_hs gi|5867194 0.121

335310 CH22_FGENES.532_3 0.121 

335453 CH22.FGENES.562J3 0.122 

305103 AA643329 Hs.111334 ferritin; light polypeptide 0.122

337284 CH22_FGENES.667-2 0.122

337418 CH22_FGENES.758-4 0.122

313073 AI963740 Hs.46826 ESTs 0.122 

303759 AW504164 EST cluster (not in UniGene) with exon hit 0.122

300017 M33197 AFFX control: GAPDH 0.122

316725 AW135084 Hs.127264 ESTs 0.122 

330738 AA293153 Hs.120980 nuclear receptor co-repressor 2 0.122 

336466 CH22_FGENES.829_25 0.122 

335956 CH22_FGENES.647_3 0.122 

315308 AA780564 Hs.189053 ESTs 0.122

338925 CH22_DJ32I10.GENSCAN.14-3 0.122

334969 CH22_FGENES.466_2 0.122 

322050 AL137589 EST cluster (not In UniGene) 0.122 

339084 CH22_DA59H18.GENSCAN.38-2 0.122 

338323 CH22_EM:AC005500.GENSCAN.306-2 0.122

337003 CH22_FGENES.419-7 0.122

325470 CH.12_hs gi|6017034 0.123

336503 CH22_FGENES.833_10 0.123 

330786 D60374 Hs.258712 EST 0.123 



338986
328311 
337241 
329446 CH.Y_hs gi|5868886
303326 AA229433 Hs.222634 ESTs; Moderately similar to ubiquitin-like protein/ribosomal protein S30
AI916313 Hs.212788 EST
AA968472 Hs.130463 ESTs 

CH.07_hsgi|5868301 
CH.17_hsgi|5867178 
CH.01_hs gi|5867447
CH.17_hs gi|5916395
CH.02_hs gi|6117815
CH22^EM.AC005500.GENSCAN.336^ 
AA527782 Hs.84298 CD74 antigen (invariant polypeptide of major



309067 
317464 
328755 
326036 
327208 
326124 
327509 
338398 
304652 

335797 
336714 
327204 
331881 
306971 
336174 
336126 
329129 
303049 
335778 
336601 
334340 
337436 
306013 
339213 
335355 
336552 
336384 
310485 
335840 
336444 
315703 
327763 



AA430672 Hs.123778 
AI126509 



AW407562 
AA896990 

AI286202 Hs.149800 
N36070 



333496 



313483 
326116 
330450 
307491 
331852 

330462 
304410 
336385 
336793 
326243 
327266 
320753 
336960 
329667 
328168 
336534 



309230 
339190 
337086 
316233 



331930 



AW294432 Hs.144252 

HG363-HT363 
AI268539 

AA418988 Hs.98314 

HG944-HT944 
AA284508 



AF070579 Hs.181544 



AI970747 

R21054 Hs.211522 

AA449077 Hs.179765 
AI475914 



CH22 FGENES.612J 
CH22_FGENES.76-29
CH.01_hs gi|5867447
ESTs 

EST singleton (not In UniGene) with exon hit 

CH22.FGENES.710J 

CH22_FGENES.701J3 

CHJLhsgi|6588Q26 

EST cluster (not in UniGene) with exon hit

CH22_FGENES.607J4 

CH22_FGENES.369_2 

CH22_FGENES.375_17 

CH22_FGENES.767-1 

EST singleton (not in UniGene) with exon hit 

CH22_FF113D11.GENSCAN.6-8

CH22_FGENES.541_2 

CH22_FGENES.841_9 

CH22_FGENES.822_4 

ESTs 

CH22_FGENES.622_3 

CH22.FGENES.827J0 

EST cluster (not in UniGene) 

CH.05_hs gi|5867961

CH22_FGENES.822_3 

CH22.FGENES.168.6 

CH.07_hs gi|6004473

CH22_DA59H18.GENSCAN.5-1 

CH.07_hs gi|5868371

CH22_FGENES.644-2

CH22_FGENES.350-7 

ESTs 

CH.17_hs gi|5867193

Epidermal Growth Factor Receptor-Related Protein 

EST singleton (not in UniGene) with exon hit 

Homo sapiens mRNA; cDNA DKFZp586L0120 

(from clone DKFZp586L0120)

Dopamine Receptor D4 

EST singleton (not in UniGene) with exon hit 

CH22_FGENES.822_5 

CH22_FGENES.176-3

CH.17_hs gi|5867261

CH.01_hs gi|5867462

Homo sapiens clone 24487 mRNA sequence 

CH22_FGENES.369-5

CH.14_p2gi|6272129 

CH.06_hsgi|5868071 

CH22.FGENES.839J6 

CH22_BA354I12.GENSCAN.16-9

EST singleton (not in UniGene) with exon hit 

CH22_FF113D11.GENSCAN.1-2

CH22_FGENES.458-14 

ESTs 

CH22_BA232E17.GENSCAN.6-8 

Homo sapiens mRNA; cDNA DKFZp586H1921

(from clone DKFZp586H192 

EST singleton (not in UniGene) with exon hit 
338477
334286 

317245 AI025039 

335249 

333327 

304240 AA009802
335464 
335236 
334154 

309257 AI984183 
310015 AI220122 



328280 
305744 
327430 
328323 
333274 
337193 
334820 
328706 
331228 
307205 
337123 
326201 
335276 
331202 
330532 
321235 
301743 
328175 
306407 
327145 
327649 
335142 
333909 
330608 



AA831819 



W67267 
AI192479 



T81115 
U03187 
N49521 
F12605 

AA971985 



X04325 



330158 

320153 AF064594 
314407 AA098835 
333383 

320663 AI734242
326233 



335174 

319843 H29920 

335458 

332997 

334188 

329759 

330348 

326958 

305263 AA679467 
337693 
326812 
333237 



336499 

320087 AF032387 
309989 AI184186 
301490 AW298468 
337011 

315052 AA876910 
301611 W22172 
336497 

302068 Y16280 
334502 



CH22_EM:AC005500.GENSCAN.373-5

CH22J=GENES.369J6 
Hs.131732 ESTs 

CH22_FGENES.516_10 

CH22_FGENES.138_20 

EST singleton (not In UniGene) with exon hit 

CH22_FGENES.562_26 

CH22_FGENES.515_8 

CH22_FGENES.340_4 

EST singleton (not in UniGene) with exon hit 
Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen

[H.sapiens] 

CH.07_hs gi|5868352

EST singleton (not in UniGene) with exon hit 
CH.02_hs gi|5867754
CH.07_hs gi|5868373
CH22_FGENES.123_2 
CH22_FGENES.575-3
CH22_FGENES.437_2 
CH.07_hs gi|5868270
ESTs 

EST singleton (not in UniGene) with exon hit 
CH22_FGENES.519-3
CH.17_hs gi|5867216
CH22_FGENES.523_2
ESTs 

interleukin 12 receptor; beta 1
EST cluster (not in UniGene) 
ESTs; Weakly similar to reverse transcriptase [H.sapiens]
CH.06_hsgi|5868073 

EST singleton (not in UniGene) with exon hit 
CH.01_hs gi|5867548
CH.04_hsgi|5867899 
CH22J=GENES.498J2 
CH22_FGENES.295_2

gap junction protein; beta 1; 32kD (connexin 32;
Charcot-Marie-Tooth neuropathy; X-linked) 
CH21_p2gi|6580367 
phospholipase A2; group VI 
ESTs 

CH22_FGENES.143_22
ESTs 

CH.17_hsgi|5867232 
CH.20_hsgi|5867634 
CH22_FGENES.504_4 
ESTs; Weakly similar to aralar1 [H.sapiens]
CH22_FGENES.562_18
CH22_FGENES.58_4 
CH22_FGENES.352_3 



Hs.174911 



Hs.191136 
Hs.121544 

Hs.204529 



Hs.2679



Hs.120360 
Hs.224432

Hs.244473 



Hs.99486 



CH.14_p2 gi|6048280



311496 AI768677 Hs.209888 



CH.09_p2gi|4544475 
CH.21_hs gi|6469836

EST singleton (not In UniGene) with exon hit 
CH22_EMAC000097.GENSCAN.78-14 
CH.20_hs gi|6682504
CH22_FGENES.108_7 
CH22_FGENES.250J3 
ESTs; Weakly similar to phosphatidylserine
synthase-2 [M.musculus]
CH22.FGENES.833J 

small nuclear RNA activating complex; polypeptide 4; 190kD
ESTs 
ESTs 

CH22_FGENES.427-6 
ESTs 
ESTs 

CH22_FGENES.833_2 
Hs.132049 endothelin type b receptor-like protein 2
CH22.FGENES.397J8 



Hs.113265
Hs.197813 
Hs.250461 

Hs.134427 
Hs.59038 



0.126 
0.126 
0.126 
0.126 
0.126 
0.126 
0.126 
0.126 
0.126 
0.126 

0.126 

0.126 

0.126 

0.126 

0.126 

0.126 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 

0.127 
0.127 
0.127 
0.127 
0.127 
0.128
0.128
0.128
0.126
0.128
0.128
0.128
0.129
304332 AA158884 EST singleton (not in UniGene) with exon hit 0.129

304522 AA465405 EST singleton (not In UniGene) with exon hit 0.129 

312407 R46180 Hs.153485 ESTs 0.129 

310098 AI685841 Hs.161354 ESTs 0.129 

301119 AF142579 EST cluster (not in UniGene) with exon hit 0.129

309268 AI985821 Hs.62954 ferritin; heavy polypeptide 1 0.129 

330989 H42142 Hs£26396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19 (Dbp5; yeast; homolog) 0.129

336949 CH22_FGENES.361-4 0.129

330115 CH.19_p2 gi|6015202 0.129

339212 CH22_FF113D11.GENSCAN.6-7 0.129 

326951 CH.21_hs gi|6004446 0.129

305165 AA662939 EST singleton (not in UniGene) with exon hit 0.129

308238 AI559492 EST singleton (not in UniGene) with exon hit 0.129

337140 CH22_FGENES.537-5 0.13

321758 U29112 EST cluster (not in UniGene) 0.13 

304619 AA515554 Hs.119598 ribosomal protein L3 0.13 

312469 AA745289 Hs.173088 ESTs 0.13 

339017 CH22_DA59H18.GENSCAN.20* 0.13 

330116 CH.19_p2 gi|6015202 0.13

333312 CH22_FGENES.138_4 0.13

338004 CH22_EM:AC005500.GENSCAN.121-1 0.13

314141 AA232134 Hs.190028 ESTs 0.13 

300509 AI239845 Hs.128494 ESTs; Weakly similar to EG.-95B72 [D.melanogaster] 0.13

338530 CH22_EM:AC005500.GENSCAN.398-11 0.13

335968 CH22_FGENES.652_1 0.13

314121 AI732100 Hs.187619 ESTs 0.13

337593 CH22_C20H12.GENSCAN.6-8 0.13 

332881 CH22_FGENES.33J 0.13 

305836 AA858043 EST singleton (not in UniGene) with exon hit 0.13

339059 CH22_DA59H18.GENSCAN.30-5 0.13

305610 AA782319 EST singleton (not in UniGene) with exon hit 0.13 

305852 AA862455 EST singleton (not in UniGene) with exon hit 0.13

327409 CH.02_hs gi|5867750 0.13

312751 AI613089 Hs.164178 ESTs 0.13

308726 AI799268 Hs.209929 EST 0.13 

325961 CH.16_hs gi|5867147 0.13

311159 AW025919 Hs.197636 ESTs 0.13 

322715 AA057230 Hs.182135 ESTs 0.13 

336441 CH22_FGENES.827_7 0.13

336339 CH22_FGENES.814_12 0.13

306911 AI095365 EST singleton (not in UniGene) with exon hit 0.13 

333613 CH22_FGENES.217_8 0.13 

338489 CH22_EM:AC005500.GENSCAN.384-17 0.131

326904 CH.21_hs gi|5867684 0.131

337337 CH22_FGENES.717-1 0.131

326752 CH.20_hs gi|5867615 0.131

303977 AW512978 EST singleton (not in UniGene) with exon hit 0.131 

301373 AA595235 EST cluster (not in UniGene) with exon hit 0.131

338448 CH22_EM:AC005500.GENSCAN.359-22 0.131

333774 CH22_FGENES.272_5 0.131 

332986 CH22_FGENES.54_8 0.131 

335362 CH22_FGENES.541_12 0.131

335896 CH22_FGENES.635_4 0.131

337825 CH22_EM:AC005500.GENSCAN.13-19 0.131

325257 CH.11_hs gi|5866895 0.131

331188 T50240 Hs.167837 ESTs 0.131 

330645 Y08302 Hs.144879 dual specificity phosphatase 9 0.131

331760 AA292721 Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131

322995 AA513829 Hs.29797 ribosomal protein L10 0.131

335497 CH22_FGENES.571_5 0.131

334824 CH22_FGENES.437_6 0.131 

319480 R06933 Hs.184221 ESTs 0.131 

334842 CH22_FGENES.439_21 0.131

333335 CH22_FGENES.139_4 0.131

317252 AA905178 Hs.130124 ESTs 0.131 

329034 CH.X_hs gi|5868561 0.131

305186 AA664230 EST singleton (not In UniGene) with exon hit 0.131 

335755 CH22_FGENES.604_4 0.131 
302143 H15270 Hs.189847 putative neuronal cell adhesion molecule 0.131

334939 CH22_FGENES.465_3 0.131 

318994 C15110 Hs.17802 ESTs 0.131

334498 CH22_FGENES.397_14 0.131

333413 CH22_FGENES.146_2 0.132 

329676 CH.14_p2gi|6272128 0.132 

327277 CH.01_hs gi|5867473 0.132

305022 AA627416 EST singleton (not in UniGene) with exon hit 0.132 

336805 CH22_FGENES.196-3 0.132

320121 T93657 EST cluster (not In UniGene) 0.132 

334761 CH22_FGENES.428_10 0.132 

339400 CH22_BA232E17.GENSCAN.7-6 0.132 

330301 CH.06_p2gi|2905862 0.132 
316822 AA827691 Hs.129967 ESTs; Weakly similar to neuronal thread protein AD7c-NTP [H.sapiens] 0.132

328020 CH.06_hsgi|5902482 0.132 

325327 CH.11_hsgi|5866875 0.132 

321163 AA209530 EST cluster (not in UniGene) 0.132 

336393 CH22_FGENES.823_5 0.132 

325905 CH.16_hs gi|5867104 0.132

305237 AA676286 Hs.2186 eukaryotic translation elongation factor 1 gamma 0.132 

339046 CH22_DA59H18.GENSCAN.28-6 0.132

325375 CH.12_hs gi|5866920 0.132

333961 CH22_FGENES.304_7 0.132 

335450 CH22_FGENES.562JB 0.133 

302286 R58438 EST cluster (not in UniGene) with exon hit 0.133 

335116 CH22_FGENES.496_3 0.133

327333 CH.01_hs gi|5902477 0.133

308070 AI470948 EST singleton (not in UniGene) with exon hit 0.133 

EST singleton (not in UniGene) with exon hit 0.133 

Hs.208839 ESTs 0.133

EST cluster (not in UniGene) 0.133 

CH.07_hsgi|5868373 0.133 

EST cluster (not in UniGene) 0.133 

CH22.FGENES.3J 0.133 

Hs.162108 ESTs 0.133 

Hs.224868 ESTs 0.133 

Hs.170187 ESTs 0.133 

CH22_FGENES.302_2 0.133

Hs.130901 ESTs 0.133 

Hs.26492 beta-1;3-glucuronyltransferase 3 (glucuronosyltransferase I) 0.133

CH22_FGENES.36-5 0.133 

CH22_DJ32I10.GENSCAN.6-10 0.133 

Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.133 

CH22_FGENES.183_2 0.134

Hs.163312 ESTs 0.134 

CH22_FGENES.283_1 0.134

CH.07_hs gi|5868262 0.134

EST cluster (not in UniGene) with exon hit 0.134 

CH22_FGENES.342_2 0.134

CH22_FGENES.513_5 0.134 

CH22_EM:AC005500.GENSCAN.179-3 0.134 

337384 CH22_FGENES.745-1 0.134

327360 CH.01_hs gi|6552411 0.134

328132 CH.06_hs gi|5868038 0.134
323604 AI751438 Hs.182827 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! 0.134

337591 CH22_C20H12.GENSCAN.6-6 0.134

307018 AI140639 EST singleton (not in UniGene) with exon hit 0.134 

326896 CH.21_hs gi|5867680 0.134

333479 CH22_FGENES.163_5 0.134

337915 CH22_EM:AC005500.GENSCAN.61-3 0.134

335110 CH22_FGENES.494_18 0.134 

333481 CH22_FGENES.163_9 0.134 

327512 CH.02_hs gi|6117815 0.134

300096 AW328639 Hs.83575 ESTs; Weakly similar to ZC328.3 [C.elegans] 0.134

330163 CH.02_p2 gi|6042042 0.135

335752 CH22_FGENES.604J 0.135 

334857 CH22_FGENES.443J 0.135 



328318 
320603 
332791 
314976 



AI470948
AI581855
AW360847 
AW248307 

R51419 

AA524725 
AL134164 

R39753 

AI733512 
F02383 



338887 

305273 AA679979 
333566 

316952 AW450033 

333818 

328687 

302879 H11802 

336557 

335222 



320581 
333944 
317992 
330935 
301872 H84730 EST cluster (not in UniGene) with exon hit 0.135

337529 CH22_FGENES.823-29 0.135 

335734 CH22_FGENES.601_4 0.135

337551 CH22_FGENES.847-8 0.135 

309078 AI920965 Hs.77961 major histocompatibility complex; class I; B 0.135

335513 CH22_FGENES.571_28 0.135 

339078 CH22_DA59H18.GENSCAN.37-6 0.135

321907 N56660 Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.sapiens] 0.135

337189 CH22_FGENES.571-32 0.135

329635 CH.12_p2 gi|5302817 0.135

308601 AI719930 EST singleton (not in UniGene) with exon hit 0.135

305020 AA627248 Hs.2064 vimentin 0.135

333894 CH22_FGENES.293J 0.135 

322465 M137152 Hs.3784 ESTs; Highly similar to phosphoserine aminotransferase [H.sapiens] 0.135

305601 AA780975 EST singleton (not In UniGene) with exon hit 0.135 

332186 H10781 Hs.141051 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB WARNING ENTRY 0.135

327822 CH.05Jisgi|5867968 0.135 

310087 AI393914 Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain binding protein 0.135

328752 CH.07_hs gi|5868298 0.135

337611 CH22_C20H12.GENSCAN.194 0.135

334470 CH22_FGENES.394_1 0.136 

335115 CH22_FGENES.496_2 0.136

328730 CH.07_hs gi|5868289 0.136 

330350 CH.09_p2 gi|3056622 0.136

336971 CH22_FGENES.378-6 0.136

308258 AI565612 EST singleton (not in UniGene) with exon hit 0.136 

326745 CH.20_hs gi|5867611 0.136

335440 CH22_FGENES.560_3 0.136 

320257 AA330746 EST cluster (not in UniGene) 0.136 

328677 CH.07_hsgi|5868256 0.136 

329731 CH.14_p2 gi|6065783 0.136

315950 AA700553 Hs.206974 ESTs 0.136

330049 CH.17_p2gi|4567182 0.136 

337070 CH22_FGENES.448-3 0.136

304095 H11324 Hs.31059 EST 0.136

309304 AW005527 Hs.232820 EST 0.136 

333458 CH22_FGENES.157_7 0.136

329899 CH.15_p2gi|6563505 0.136 

322202 AI275056 Hs.200133 ESTs 0.136 

333991 CH22_FGENES.310_15 0.136

318617 AW247252 Hs.75514 nucleoside phosphorylase 0.136

310623 AI341586 Hs.195588 ESTs 0.136

330489 M23323 Hs.3003 CD3E antigen; epsilon polypeptide (TiT3 complex) 0.136

309646 AW194694 EST singleton (not in UniGene) with exon hit 0.136 

331068 R00071 Hs.191199 ESTs 0.136 

334285 CH22.FGENES.369J5 0.136 

332178 F13689 Hs.100725 EST 0.136

305724 AA827608 EST singleton (not in UniGene) with exon hit 0.136

303158 AL138110 Hs.8594 Homo sapiens mRNA containing (CAG)4 repeat; clone CZ-CAG-7 0.136 

334543 CH22_FGENES.403_8 0.136

335384 CH22_FGENES.543_26 0.136

336527 CH22_FGENES.839_8 0.136

334951 CH22_FGENES.465_20 0.136

325882 CH.16_hsgi|5867087 0.137 

305134 AA653159 EST singleton (not in UniGene) with exon hit 0.137 

307058 AI148709 EST singleton (not in UniGene) with exon hit 0.137 

331943 AA453418 Hs.178272 ESTs 0.137

331116 R44780 Hs.22634 ESTs 0.137 

306094 AA908877 EST singleton (not in UniGene) with exon hit 0.137 

333561 Cr^FGENES.IBOJS 0.137 

321439 H61962 EST cluster (not in UniGene) 0.137 

324594 AA497090 EST cluster (not in UniGene) 0.137

337926 CH22_EM:AC005500.GENSCAN.77-4 0.137

337353 CH22_FGENES.726-1 0.137

331836 AA412295 Hs.104774 EST 0.137 

308981 AI873242 EST singleton (not in UniGene) with exon hit 0.137 
329424 CH.Y_hs gi|5868879 0.137

325829 CH.15_hs gi|5867052 0.137

331845 AA416863 Hs.98183 ESTs 0.137 

333854 CH22_FGENES.290_13 0.137

306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137

328948 CH.08_hs gi|6456765 0.137

338935 CH22_DJ32I10.GENSCAN.18-12 0.137

325960 CH.16_hs gi|5867147 0.137

328377 CH.07_hs gi|5868390 0.138

308851 AI829820 EST singleton (not in UniGene) with exon hit 0.138

314620 AA424352 Hs.210586 ESTs 0.138 

337592 CH22_C20H12.GENSCAN.6-7 0.138 

338684 CH22_EM:AC005500.GENSCAN.472-3 0.138 

331800 AA400498 Hs.97543 ESTs 0.138 

304587 AA505535 EST singleton (not in UniGene) with exon hit 0.138

333981 CH22_FGENES.310_4 0.138

332452 AA040369 Hs.11170 SYT interacting protein 0.138 

305752 AA835278 EST singleton (not In UniGene) with exon hit 0.138 

311947 T65554 Hs.251591 EST 0.138 

333783 CH22_FGENES.273_5 0.138

337406 CH22_FGENES.754-14 0.138

327976 CH.06_hs gi|5868212 0.138

325593 CH.13_hs gi|5866992 0.138

339425 CH22_DJ579N16.GENSCAN.14-4 0.138 

304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138

309488 AW131104 EST singleton (not In UniGene) with exon hit 0.138 

337532 CH22_FGENES.827-6 0.138 

317234 AA904448 Hs.126368 ESTs 0.138 

312261 AA854425 Hs.144455 ESTs 0.138 

328927 CH.08_hs gi|5868500 0.138

336424 CH22_FGENES.824_9 0.138 

326667 CH.20_hs gi|6552455 0.138

325988 CH.16_hsgi|5867064 0.138 

318446 AW300287 EST cluster (not In UniGene) 0.139 

338511 CH22_FGENES.834_6 0.139

335204 CH22.FGENES.508J3 0.139 

303244 AA147472 EST cluster (not in UniGene) with exon hit 0.139

330870 AA115804 Hs.187593 ESTs 0.139

329376 CH.X_hs gi|5868859 0.139

304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139

333653 CH22.FGENES.239J2 0.139 

306799 AI051696 EST singleton (not in UniGene) with exon hit 0.139 

304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139 

330812 AA013001 Hs.60563 ESTs 0.139 

329568 CH.10_p2 gi|3962490 0.139

319210 AA253074 Hs.146261 ESTs 0.139 

334320 CH22_FGENES.374_5 0.139

300860 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139

305866 AA864533 EST singleton (not in UniGene) with exon hit 0.139 

312943 AA984364 Hs.119064 ESTs 0.139

330523 M99439 Hs.83958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139

312708 AI076204 Hs.135440 ESTs 0.139

309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139

303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139 

317484 AW274696 Hs.143921 ESTs 0.139

333239 CH22_FGENES.111_1 0.139

307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139 

316813 AA826505 Hs.124517 ESTs 0.139 

331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0386 [H.sapiens] 0.139

308558 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139

310784 AW086142 Hs.159017 ESTs 0.139 

323831 AA335715 Hs.200299 ESTs 0.139 

307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139 

310570 AI318327 EST cluster (not in UniGene) 0.139

327934 CH.06_hs gi|5868184 0.139

305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139

334756 CH22_FGENES.428_5 0.139

331938 AA451867 Hs.99255 ESTs 0.139 

301393 AI474722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139 
312005 T78450 Hs.13941 ESTs 0.139

338431 CH22^EM^C005500.GENSCAN^514 0.14 

331214 T9G496 Hs.16757 ESTs 0.14 

333601 CH22_FGENES.213_4 0.14 

323481 AA278449 Hs.137429 ESTs 0.14

336911 CH22_FGENES.344-4 0.14 

338157 CH22_EM:AC005500.GENSCAN.209-5 0.14 

327845 CH.05_hs gi|6531962 0.14

319109 Z45662 Hs.90797 Homo sapiens clone 23620 mRNA sequence 0.14 

334763 CH22_FGENES.428_12 0.14

329384 CH.X_hs gi|5868869 0.14

302996 AF054663 EST cluster (not in UniGene) with exon hit 0.14

323751 AW452656 Hs^Q9824 ESTs 0.14 

329916 CH.16_p2gi|6223624 0.14 

301993 N49826 Hs.18602 ESTs 0.14

338129 CH22_EM:AC005500.GENSCAN.197-2 0.14

325704 CH.14_hs gi|5867028 0.14

335656 CH22_FGENES.590_7 0.14 

331673 W72366 Hs.40033 ESTs 0.14 

316807 AI018331 Hs.172444 ESTs; Highly similar to transcription regulator [M.musculus] 0.14

310743 AW449754 Hs.158665 ESTs 0.14 

326941 CH.21_hs gi|6004446 0.14

328809 CH.07_hs gi|5868327 0.14

323855 AI653164 Hs.128665 ESTs 0.14 

304705 AA564064 EST singleton (not in UniGene) with exon hit 0.14

325666 CH.14_hsgi|6469822 0.14 

333747 CH22_FGENES.265jB 0.14 

318287 AW015616 Hs.143321 ESTs 0.141 

332972 CH22_FGENES.51_5 0.141

305704 AA825266 EST singleton (not in UniGene) with exon hit 0.141

315699 AW182805 Hs.189183 ESTs; Weakly similar to Nod1 [H.sapiens] 0.141

327296 CH.01_hsgi|5867492 0.141 

336400 CH22_FGENES.823_15 0.141

321033 H26214 Hs^0733 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY 0.141

316522 AI475995 Hs.122910 ESTs 0.141 

335715 CH22LFGENES.599J5 0.141 

335959 CH22_FGENES.650_2 0.141 

333259 CH22_FGENES.118_7 0.141 

337382 CH22_FGENES.744-8 0.141

322346 AA227618 Hs.10882 HMG-box containing protein 1 0.141 

325378 CH.12_hs gi|5866920 0.141

338500 CH22_EM:AC005500.GENSCAN.390-1 0.141 

338460 CH22_EM:AC005500.GENSCAN.362-5 0.141

315279 AW511138 Hs.256581 ESTs 0.141

314439 AI539443 Hs.137447 ESTs 0.141 

333824 CH22_FGENES.222_3 0.141

329237 CH.X_hs gi|5868729 0.141

330117 CH.19_p2gi|6015201 0.141 

338017 CH22_EM:AC005500.GENSCAN.134-1 0.141

337854 CH22_EM:AC005500.GENSCAN.38-12 0.142

329984 CH.16_p2 gi|4646193 0.142

305004 AA622328 Hs.162762 EST 0.142

302815 N40373 EST cluster (not in UniGene) with exon hit 0.142 

327823 CH.05_hs gi|5867968 0.142

326753 CH.20_hsgi|5867616 0.142 

301201 AA904482 Hs.197775 ESTs 0.142 

334303 CH22_FGENES.373J 0.142 

326453 CH.19_hsgi|5867399 0.142 

311050 AI864581 Hs.215477 ESTs 0.142

308740 AI802711 Hs.210337 EST; Weakly similar to aldolase A [H.sapiens] 0.142

331003 H63959 Hs.142722 ESTs 0.142 

338010 CH22_EM:AC005500.GENSCAN.128-8 0.142 

336326 CH22.JGENES.812J 0.142 

318100 R44308 Hs.242302 ESTs 0.142

320641 R55421 EST cluster (not In UniGene) 0.142 

325855 CH.16_hs gi|5867067 0.142

330425 HG1728-HT1734 Non-Specific Cross Reacting Antigen (Gb:D90277), Alt. Splice Form 2 0.142
324583 AA425411 Hs.22581 ESTs 0.142

326268 CH.17_hsgi|5867267 0.142 

331390 AA460341 Hs.45008 ESTs 0.142 

338904 CH22_DJ32I10.GENSCAN.10-16 0.143

5 333096 CH22J=GENES.79J 0.143 

331919 AA446869 Hs.119316 ESTs 0.143

312214 AI248004 Hs.125187 ESTs 0.143 

323198 AW179174 Hs.7984 ESTs 0.143 

316107 AI204001 Hs.184014 ribosomal protein L31 0.143

301335 AA885317 Hs.190511 ESTs 0.143

337392 CH22_FGENES.747-3 0.143

325543 CH.12_hsgi|6682452 0.143 

305903 AA873085 EST singleton (not in UniGene) with exon hit 0.143 

332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin) 0.143

337913 CH22_EM:AC005500.GENSCAN.59-10 0.143

301436 AA961061 Hs.131696 ESTs 0.143 

335078 CH22_FGENES.486_5 0.143 

338451 CH22_EM:AC005500.GENSCAN.359-39 0.143

302777 AJ230640 EST cluster (not in UniGene) with exon hit 0.143 

330464 J03068 Hs.78223 N-acylaminoacyl-peptide hydrolase 0.143

330988 H41411 Hs.33855 ESTs 0.143 

328939 CH.08_hs gi|6004481 0.143

308015 AI440174 Hs.228907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens] 0.143

328504 CH.07_hs gi|5868471 0.143

332599 AA402891 Hs.32951 solute carrier family 29 (nucleoside transporters); member 2 0.143 

335744 CH22.FGENES.601J5 0.143 

322394 AF077208 EST cluster (not In UniGene) 0.143 

323892 AL042661 EST cluster (not in UniGene) 0.143

318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA-5 CHAIN PRECURSOR [H.sapiens] 0.143

336568 CH22_FGENES.843_7 0.143

330958 H08815 Hs.159824 EST 0.143

327672 CH.04_hs gi|5867843 0.143

335900 CH22_FGENES.635_8 0.144 

336044 CH22_FGENES.679_6 0.144

318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens] 0.144

333483 CH22_FGENES.165_2 0.144 

333337 CH22_FGENES.139_6 0.144

305993 AA889197 EST singleton (not in UniGene) with exon hit 0.144 

335719 CH22_FGENES.599_22 0.144

325682 CH.14_hs gi|6138923 0.144

327350 CH.01_hs gi|6249563 0.144

339291 CH22_BA354I12.GENSCAN.18-1 0.144 

326358 CH.18_hsgi|5867293 0.144 

330316 CH.08_p2 gi|6007576 0.144 

308150 AI499346 Hs.174131 ribosomal protein L6 0.144

338065 CH22_EM:AC005500.GENSCAN.164-1 0.144

339009 CH22_DA59H18.GENSCAN.18-7 0.144 

327776 CH.05_hs gi|5867964 0.145

336664 CH22_FGENES.41-8 0.145 

321921 AF070619 EST cluster (not in UniGene) 0.145

319346 T70147 Hs.12024 ESTs 0.145 

304265 AA062892 EST singleton (not in UniGene) with exon hit 0.145 

303818 Z45986 Hs250178 copmell 0.145 

327498 CH.02_hs gi|6017023 0.145

60 335227 CH22.FGENES513J3 0.145 

339022 CH22_DA59H18.GENSCAN.22-1 0.145

302597 H55661 Hs.33026 ESTs; Weakly similar to similar to Enterococcus faecalis TRAB [C.elegans] 0.145

308550 AI697008 Hs.201811 EST 0.145 

302175 AA262760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381 0.145

303252 AA156760 EST cluster (not in UniGene) with exon hit 0.145

337414 CH22_FGENES.757-2 0.145 

310382 AI734009 EST cluster (not in UniGene) 0.145 

329333 CH.X_hs gi|5868806 0.145
336857
332565 
318634 
336318 
310960 
335346 
331196 
337607 
331206 
301793 
319590 
311394 
324773 
324841 
332260 
329276 
335887 
338294 



AA234896 
AI928098 

AI923551

T65416 

T84096 

T80698 

AA210878 

AI695374 

AA632554 

AI142359 

N70088 



334135 
326251 
337396 
339167 

316838 AW135418 
325313 

331047 N66918 
323915 AL043362 
302747 AF062275 
306317 AA947909 



334399



338238



326472 
333061 
337072 
334328 
327039 
325576 

315935 AI075804 
319638 AA323758 
334501 



AI148477
AW504854 



AA332011 

AA333068 
AA385315 



308636 AI744063 
336567 
335819 
336950 
307055 
315134 
335834 
327870 
323802 
329412 
323791 
324126 
327865 
333445 
321302 
336744 
323731 
320289 
305488 
305592 
304094 
325040 
339034 
334504 
334778 
320148 
303584 
325826 
331192 



AA021351 Hs.158497 



AA323414 
H07989 
M749000 
AA780594 
H11295 
AW296368 



U77494 
AW173759 

T55182 



CH22_FGENES.291-7 0.145

Hs.25272 E1A binding protein p300 0.145 

Hs.156832 ESTs 0.145 

CH22J=GENES.801J 0.145 

Hs.170843 ESTs 0.145 

CH22_FGENES.537J> 0.145 

Hs.12826 ESTs 0.145 

CH22_C20H12.GENSCAN.17-3 0.146 

Hs.15284 ESTs 0.146 

EST cluster (not in UniGene) with exon hit 0.146 

EST cluster (not in UniGene) 0.146 

Hs.256231 ESTs 0.146 

Hs.163401 ESTs 0.146

Hs.155316 ESTs 0.146 

Hs.138467 ESTs 0.146 

CHX_hsgi|5868762 0.146 

CH22_FGENES.633_1 0.146 

CH22_EM:AC005500.GENSCAN^97-1 0.146 

CH22_FGENES.409-4 0.146

CH22_FGENES.336J2 0.146 

CH.17_hs gi|5867263 0.146

CH22_FGENES.749-1 0.146

CH22_DA59H18.GENSCAN.69-8 0.146

Hs.161210 ESTs 0.146 

CH.11_hsgi|5866865 0.146 

Hs.32205 ESTs 0.146 

EST cluster (not in UniGene) 0.146 

EST cluster (not in UniGene) with exon hit 0.146 

EST singleton (not in UniGene) with exon hit 0.146

CH22 FGENES.382J5 0.146 

CH.19_hs gi|5867404 0.146

CH22_FGENES.75_4 0.146 

CH22_FGENES.448-5 0.146 

CH22_FGENES.375_5 0.146 

CH.21_hs gi|6531965 0.146

CH.12_hs gi|6552443 0.147

Hs.132660 ESTs 0.147 

EST cluster (not in UniGene) 0.147 

CH22_FGENES.397J7 0.147 

CH22_EMAC005500.GENSCAN2644 0.147 

EST singleton (not in UniGene) with exon hit 0.147 

CH22_FGENES.843_6 0.147

CH22_FGENES.619_2 0.147 

CH22_FGENES.361-8 0.147

EST singleton (not in UniGene) with exon hit 0.147 

Hs.126714 ESTs 0.147 

CH2£FGENES.621J 0.147 

CH.06_hsgi|5868131 0.147 
Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 0.147 

CH.X_hs gi|6682553 0.147

EST cluster (not in UniGene) 0.147 

EST cluster (not in UniGene) 0.147 

CH.06_hs gi|5868130 0.147

CH22_FGENES.154 J. 0.147 

KIAA0724 gene product 0.147 

CH22_FGENES.118-9 0.147 

EST cluster (not in UniGene) 0.148 

EST cluster (not in UniGene) 0.148 

EST singleton (not in UniGene) with exon hit 0.148 

Hs.62954 ferritin; heavy polypeptide 1 0.148 

EST singleton (not in UniGene) with exon hit 0.148 

EST cluster (not in UniGene) 0.148

CH22_DA59H18.GENSCAN.26-2 0.148

CH22_FGENES.398_2 0.148 

CH22_FGENES.431_2 0.148

Hs.119687 RAN binding protein 8 0.148

Hs.203401 ESTs 0.148 

CH.15_hs gi|5867048 0.148
Hs.152571 ESTs; Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 0.148
325785 CH.10sgil6381957 0.148

333166 CH22_FGENES.91_8 0.148

336548 CH22_FGENES.841_5 0.148 

337552 CH22_C4G1.GENSCAN.1-4 0.148 

331775 AA382742 Hs.97151 EST 0.148

338936 CH22_DJ32I10.GENSCAN.19-6 0.148

331869 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148

332865 CH22_FGENES.28_5 0.148

328663 CH.07_hs gi|6004473 0.148 

328436 CH.07_hs gi|5868417 0.148

311158 AI634864 Hs.250789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148

336942 CH22_FGENES.354-2 0.148

302262 R53169 Hs.246091 ESTs 0.149 

333296 CH22_FGENES.132_3 0.149 

333365 CH22_FGENES.142_2 0.149

311706 AW452392 Hs.252854 ESTs 0.149 

337109 CH22_FGENES.489-2 0.149

315062 AW173300 Hs.190201 ESTs 0.149 

333454 CH22_FGENES.157_3 0.149

334784 CH22_FGENES.432_9 0.149

333255 CH22_FGENES.118_3 0.149 

337518 CH22_FGENES.814-7 0.149

320651 AA489268 EST cluster (not in UniGene) 0.149

323437 AA287567 EST cluster (not in UniGene) 0.149 

328761 CH.07_hs gi|5868302 0.149

328787 CH.07_hsgi|5868309 0.149 

335261 CH22_FGENES.520_2 0.149

300827 R16689 Hs.106004 ESTs 0.149 

339263 CH22_BA354I12.GENSCAN.10-1 0.149 

30 337412 CH22_FGENES.756^ 0.149 

334414 CH22_FGENES.384_1 0.149 

332931 CH22_FGENES.38_5 0.149 

310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149 

305216 AA669056 EST singleton (not in UniGene) with exon hit 0.149 

314779 AA470122 Hs.190261 ESTs 0.149

338414 CH22_EM:AC005500.GENSCAN.341-27 0.149

303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149 

337509 CH22_FGENES.806-4 0.149 

306631 AI001149 EST singleton (not In UniGene) with exon hit 0.149 

40 302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149 

336536 CH22_FGENES.839J8 0.149 

324666 T32458 Hs.14285 ESTs 0.149 

310173 AI767433 Hs.170013 ESTs 0.149 

333595 CH22_FGENES.211.J2 0.149 

335975 CH22_FGENES.652_9 0.15

306654 AI003654 EST singleton (not in UniGene) with exon hit 0.15

335025 CH22_FGENES.475_3 0.15

328711 CH.07_hs gi|5868271 0.15

328274 CH.07_hs gi|5868219 0.15

325505 CH.12_hs gi|6682451 0.15

329641 CH.14_p2 gi|6468233 0.15

304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15

339103 CH22_DA59H18.GENSCAN.44-10 0.15

329636 CH.12_p2 gi|5302817 0.15

310118 AI203293 Hs.157489 ESTs 0.15

326056 CH.17_hsgi|5867184 0.15 

303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15 

303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15 
TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 13. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.
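
As a minimal sketch of how a Table 13A row could be read programmatically, the Python below splits a whitespace-delimited row into its probeset identifier, gene cluster number and accession list, and then builds a reverse lookup from accession to probeset. The names `ClusterRow`, `parse_cluster_row` and `index_by_accession` are hypothetical, the sample row is illustrative (the spelling of its cluster number is an assumption), and the code is not the clustering software named above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class ClusterRow:
    """One Table 13A row (hypothetical container, for illustration only)."""
    pkey: int               # unique Eos probeset identifier number
    cat_number: str         # gene cluster number, kept as text
    accessions: List[str]   # Genbank accessions comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a 'Pkey CAT_number Accession...' row on whitespace."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected pkey, CAT number and accessions: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


def index_by_accession(rows: Iterable[ClusterRow]) -> Dict[str, List[int]]:
    """Map each accession to the probesets whose cluster contains it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Illustrative row; the spelling of the cluster number is an assumption.
row = parse_cluster_row("302753 33029_1 M74299 M74302 M74303")
print(row.cat_number, index_by_accession([row])["M74302"])
```

Keeping the cluster number as text avoids guessing at its internal structure, and the reverse index makes it straightforward to ask which probesets a given Genbank accession contributed to.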
Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accession



322050 24275 J 
321439 1599424.1 
321666 13653.22 



300088 622937J 
322303 704603 J 
322394 27492.1 



321758 44275J 
323109 155498.1 



322533 38937J 
321921 34680.1 
321927 21620J 



321932 265316J 
306971 14694.7 



AL1 37589 AA423949 BE222949 BE222694 AI1 99615 AW8731 16 AI277950 AW044290 AW630096 
H61962 W01567N75711 

BE259906 AA232518 AA013359 AL035788 AW160822 BE387134 BE002954 BE391839 AW161565 A1878841 BE616458 
BE409981 BE387308 BE297436 BE315536 AA206924 R12012 AA214169 BE312812 BE387093 H11710 BE312009 
BE260569 AA343566 AA219526 R34757 AA219749 BE336733 AA219751 AW411099 AA232408 BE018716 BE398089 
AA206253 AA053487 M1 14224 AV655868 AW732566 BE394087 AW732574 AA313442 BE336875 M070548 BE259840 
BE019828 AW732341 AA299916 BE019253 BE018238 BE387109 AA232304 BE255589 AW732585 M181436 AA308777 
AA075802 AW732521 AA314526 AA226747 BE409513 AA206168 BE388292 BE298782 BE387086 AA305310 AV652723 
AA314918 BE615510 AW951763 BE398104 BE385195 BE407165 BE391336 BE390187 BE389189 BE540650 BE249884 
BE385985 BE274245 BE391124 BE260080 AA182600 BE512821 BE390090 BE279398 BE279589 BE263454 BE515194 
BE293569 BE272531 BE388814 BE384659 BE271685 BE561043 BE278449 BE302572 AW239076 AI750583 AA376179 
AA1 12632 BE266324 BE266614 R13105 AA132286 BE296305 A1220355 AA205606 AA219527 AA219519 AW804310 
AA083286 BE171208 T19693 AA338328 BE185868 AA903024 T92162 AA3301 19 BE410404 BE314668 
AW576245 BE207878 AW299993 AI199558 AI285442 AW299994 AW394242 AW394184 
At357412 AI870708 A1590539 W07459 

AW068287 AA310079 BE336702 AA356318 AA306059 AA346785 AW402633 AA31 1210 AW402909 N76879 AW402913 
AW401920 AA321636 AA354474 C17297 C16938 AA311774 M29871 NM.002872 282188 AW405674 H94176 R89281 
AA214723 AI014482 AW949347 T27749 AW804226 AW796964 AW404581 AF077208 NM.014Q29 W68830 W79652 
AA353375 AW575218 AA552192 AA521232 AA702695 AA033975 AW407827 AA829948 N944Q2 AW628604 A1523308 
N57605 AA641662 H42477 N52784 A1753478 AA768493 AA845729 W47391 N55270 AI090117 R89282 BE206172 
AA076650 AA595650 AI218931 BE049397AI433110W74114H94277A1358627 AI085221 AI862818 AA835967 AW103905 
AI640644 AA835507 AA856887 AA694392 AW337542 AI524410 BE045500 AI440060 AI358801 AW028238 AW205248 
AI718264 R48618 AA357358 AI695002 AA897549 AW081065 AI433360 AI810783 A1620963Z82188 AA36Q224 
U291 12 AI656540 A1364875 A1656246 AI99094O 

AA169345 AI762857 AI949997 AI809601 AI681948 AI221079 AW167404 A1347614 AI61 1090 AI023472 AI347683 AI027467 
AW591788 AI380665 AA835735 AA836654 A1244028 AW193159 AI5001 12 AI918722 AI738693 Ar702308 AA805365 
AI766842 

T59538 T59589 759598 T59542 AF147374 
AF070619 R20302 TB0358 

AJ223366 BE305086 AW820106 AA621983 BE305208 AI738475 AI380189 AW590847 AI127232 AA622706 AI380858 
AA621975 A1587036 AA665743 AW204003 AI692234 AI002242.AI692219 AW137282 AW268783 AW295910 AJ308015 
AW301462 AI318288AI318575 AI318117 AI345591 A1249650 AI246934 A1246864 A1246971 AW268311 AI249654BE041907 
AW732776 

N72324 N52825 W19526 BE143464 AA376060 

M83667 NM.005195 S63168 M83667 AW068039 AW630649 AI338577 A1018125 AI269878 AW242440 A1887823 AI342581 
BE222416 AI582847 AI651011 A1660815 AI699574 BE550201 AI926996 AW665855 AI827752 AI761857BE328168 
BE222451 AI762201 AW0Q0929 AW007207 BE042962 BE551843 BE465373 AI279179 AI949945 BE551862 AW051667 
BE328076 BE222296 AW007229 AW772332 AI279801 AI934526 AI631938 AI770103 BE041412 AI417900 AI692655 
AI869943 AW270119 AI431739 A1703347 AW770568 AW025473 AI701497 All 28026 BE328147 AW203980 BE046793 
AW087704 AI674597 A1650732 AI813691 AI472092 AI695224 AI241217 AW207746 AI206840 AI271362 AI631788 AI911883 
AI914619 AJ38Q585 AI767501 AI823759 AI564116 AI190991 AI377369 AI814122 AI221623 AI354793 A1081988 A/391740 
A1337435 BE467366 A1824347 A1565325 AI280038 A1640455 A1819744 BE467803 BE327524 A1149402 AI313187 BE219684 
AW611948 AW665821 AI091260 AW044492 BE220366 AW025381 AW1 83264 A1694865 AI498474AI129780 A1202028 
AI566792 BE220659 AI928040 AI830696 AW93021 AW612488 AJ913152 BE042965 A1631837 AI693873 AI498925 AI768668 
AM01544 BE327023 A1693383 AI76S874 AI744003 AW082273 AI686501 AI798177 AI985196 AI090033 AI432342 A1689918 
Al 638308 BE468080 BE219588 AI9121 19 BE219787 AW005392 BE326564 A1589039 AI860187 AI758143 AI338168 
AI702936 BE221985 AI498727 A1918196 AI279735 AW771497 A1860133 AW237834 AW661759 AW0281 1 1 BE503416 
AI360180AW811715AI871777BE045447BE326444 AI266547 A1800237 AI823315 AI478368 AI264281 AI675841 AI690041 
301119 33384J



324019 262792J 

323437 189513J 

307845 19804J0 

324126 272259J 

309101 7570J 





315703 119175J 




301373 368214 1 




323665 54093 J 




323676 220254J 




302086 23306 J 




323731 226193J 


323791 232336J 




325040 23854.1 




324430 312113J 




323892 477253J 


309488 1030131J 




302251 27216.4 




302286 22717^6 




323915 110063J 




324594 330528J 


301737 65_1 



AI498018 A1554124 AI239893 AI864054 AI280099 AI192815 A1620465 AI080201 AW002057 BE500986 AI341 131 AI81 8991 
AI566137 AI1 23403 BE219192 AW183844 AI499842 AW137971 AW1 38720 AW015526 AW138160 AW243163 AW138705 
AW139927 AW140006 AW138810 AW1 37450 AW206970 AW135419 AW205974 AA043494 BE465106 AW1 39955 AI741 112 
BE326942 AA043506 A1079957 A1942432 AI392902 AI097047 AI470599 AA514553 AA984008 N47949 A16541 14 AA884832 
AI796752 AI765290 AI301 155 AW470358 BE222764 AI823569 AI651 188 AI692695 AI476643 BE504307 AI767573 BE219719 
AI932249 AW467075 AI913633 BE221966 A1091025 AA969215 AI799810 AA931 170 BE048559 AI809606 A1138614 
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A1434259 R49181 T58717 AW062486 AW796966 AI648384 R77733 AI623502 BE171342 BE171303 R35658 AW974883 
AW149898 AI500045 A1540710 A1540392 AW009172 AW277199 AI371312 A1500096 AI470297 AW372940 AW844562 
AW844560 AW797965 AI691 146 X07062 AW799199 H60666 AA837684 AF130734 T25952 A1933771 A1914860 AW391925 
AW793843 AW795012 AW366709 AW750987 AW750985 R35765 AW844942 AW750986 H64920 R34651 X86703 
BE018103 BE018083 BE293253 AW247083 BE207643 BE514793 BE183238 AA376427 AW273850 AW043786 BE439973 
AL045428 AI889050 AA026496 AI422924 AI884485 W96068 AAQ20872 F371 19 AA714378 AA021 107 AA01 1 141 AI554001 
AI375841 AI469097 AA335219 AW967315 AI692177 AA410448 AI568858 AA582647 AA026419 AA281639 AW515248 
AW007777 AA010840 AW188439 A1805423 AI148210 BE301590 AA744414 AA745392 AW167423 AA622659 AW000878 
AI432387 AA760930 BE047189 AA021605 AV658045 AI093347 AA588594 H63143 AA639556 AI308976 AA379270 
AA633407 AI874329 AI206484 AI493895 AI694103 AI249682 AA973765 AA872445 AI125446 AA287272 AW069761 
AA682569 AW009712 BE542774 R50167 BE301574 AA991202 AA502006 A1219819 AW074373 AA617996 AI521242 F25241 
AW615812 R16774 AA335218 AW673800 H26778 A1468557 AI886986 AI560759 AI460075 AA502968 AA503273 AA610680 
AA287274 AA554020 AA284889 AA916636 AW469457 AW273250 AW673708 AW512948 AL041071 AI446042 AA903535 
BE172441 AI282411 AW265021 AA810799 AI559865 AA729332 AW004611 AW129451 AA659019 BE208239 AA610825 
H03511 BE383995 R16474 AA281701 AW009244 AA287424 AA558139 AW364081 

F08147 AW408359 AW949429 R23785 AW247442 AA305512 T29095 AA905130 BE246361 BE244981 AA220199 BE504058 
X80878 AA533727 M608601 AW005964 AI811627 A1367037 AI277985 AW93719 AE77848 AA854982 AW247298 AI216345 
AI041295 AI887378 AA781241 AI674270 AW628959 AI383083 BE504391 AA729421 AA552188 AA373387 AW880360 
AW875262 AW875369 AW581540 AW875358 AW581568 R23735 AW134768 
W03912 AW971410 AA506385 AA209530 H73495 H48629 W56149 
H56752AW340384N49521 

AA853680 AK001668 BE386425 BE563549 BE296124 BE298950 R51419 U46295 BE147292 AA360056 R48018 AW845348 
N47383 AI817280 AI671902 AA988104 AM79464 N56996 AI192374 AI927558 AA659888 AI799903 AA548397 AI161 167 
AI656333 AI418829 AW592671 BE327906 AW513346 AI888579 AW469410 AW512809 D25682 AA576079 AA479354 
T30342 R51307 T16044 H29063 AW079357 AI339477 R47914 AI986068 AI870065 AI868489 AI521099 A1582732 AA995540 
AW957299 AA352608 AA676752 AM10510 AA358874 A1865724 AA853679 A1699265 AW188789 N47380 AA23371 5 
BE258194 R55421 R55643 H42362 AA243884 

AW886407 AA489268 R57015 R58094 BB077459 BE077423 BE546995 AW849216 T69383 AW9381 1 1 H60337 BE221073 
AB033100 AA347036 BE260325 AW961669 AL047207 AA347037 AI766894 AA601045 AI559897 AW139033 AW274622 
AW172884 AW089070 AA804340 AW798925 
AA825266 

AL1 37354 AL043375 
AA971985 
AA977992 
AA989542 



AA989713 
AA991487 
AI000246
AI000248
AI001149 
AI003654 
AI041589 
AI051696 
AI452732 
AI470948 
AJ475914 
AI055966 
AI066577 



310570 1071946 J 
305022 
305060
305070 



AI095365 
AI127683 
AI559492 
AI565612 
AI571211 
AI581855 
AI591235 
AI687580 
AI719930 
AI735634 
AI744063 
AI819263 
AI829820 
AI873242

AI318327 AI318328 AI318495

AA627416 

AA635771 

AA639783 
305079
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305488 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307126 
305903 



AA641329 

AA653159 

AW512978 

AA669056 

AA679467 

AA679772 

AA721052 

AA723748 

AA749000 

AA773530 

AA760975 

AA782319 

AA789095 

AA826544 

AA827608 

AA831819 

AA835278 

AI140639

AI148477

AI148709

AA845997 

AA857665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951

AA873085 



328803 c_7_hs
328809 c_7_hs
305949 AA884409
328829 c_7_hs
330021 c16_p2
330024 c16_p2
330028 c16_p2
330049 c17_p2
305993 AA889197

330095 c19_p2

330096 c19_p2

307205 AI192479
307427 AI243437
307491 AI268539
307581 AI284415
307588 AI285535
337672 CH22_6O02FGL_UNK_EMAC00
337693 CH22_6O30FG_LINK_EMACO0
337738 CH22.6083FGL_MNK_EMAC00
307692 AI318342
307806 AI351739
309107 AI925823
309230 AI970747
339338 CH22_83Q0FG_JJNK.BA354M
309257 AI984183
309366 AW072970
309422 AW087175
325207 c10_hs
325257 c11_hs

309646 AW194694
309651 AW195850

325313 c11_hs

309924 AW340812
334030 CH22J308FGL320_2_UNK^EM 
334040 CH22_1318FGL322_8_LINK_EM 
334083 CH22J361FGL327_38J.INK_E 
332810 CH22J6FQ_7J2_LINK_C65E1 
302747 32813 1 AF062275 L03830 

302753 33029J M74299 M74302 M74303 

302777 33803 J AJ230640 AJ230648 
304094
302024 



304240 
304410 
304443 
304475 
304522 
304878 
304705 
306004 
306008 
306013 
306082 
336174 
306094 
304823 
304872 
304918 
304955 
306249 



35372J 
41196J 
c16_hs 



H11295 

U21260U21258 
AF054663AF124197 R70292 

AA009802 
AA284508 
AA399444 
AA428879 
AM65405 
AA548556 
AA564064 



AA894390 
AA8969S0 
AA908508 
CH22.3567FQ_710J.UNK.DA 
AA908877 
AA584837 
AA595289 
AA602697 
AA613504 
AA933840 
AA936892 
306295 AA937331 
306317 AA947909 
306347 AA961144 
AA962086 
AA970548 

entre2L_D28383 
460L2 



330401 
330463 



330535 
332634 



1374.-8 
10404_2 



NM_001055 AA332948 U26309 U09031 L19955 L10819 A1366043 X84654 U71086 AV654451 AJ007418 AA053625 
BE168856 AA376730 H12694 AA81C348 AA621972 AI818950 AV645367 AI819966 AA910602 AW512449 H67893 AI310497 
AI304330 AI339217 AW1935B8 AW438688 AI81 8970 AW316799 AA906527 AA777570 N47673 AI336428 AW945133 
A1038606 R29692 AW194197 AI304748 H12639 AA053178 AA493213 AA676958 AA1 13154 AI313469 AI368239 R93183 
W24532 U52852 U54701 AL046864 AA365795 
U11872 

U24488NM.007116 
TABLE 13B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 13. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7-digit numbers in this column are Genbank Identifier (GI) numbers

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
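
The column definitions above can be illustrated with a short Python sketch that parses one Table 13B row and derives the span of the predicted exon. The names `ExonRow`, `parse_exon_row` and `exon_span` are hypothetical, and treating the two coordinates as inclusive (so that length is the difference plus one) is an assumption, not something the table states. Minus-strand rows list the larger coordinate first, so the sketch orders the pair before measuring.

```python
import re
from typing import NamedTuple, Tuple

# pkey, free-text sequence source, strand, then "start-end"
ROW = re.compile(r"^(\d+)\s+(.+?)\s+(Plus|Minus)\s+(\d+)-(\d+)$")


class ExonRow(NamedTuple):
    pkey: int      # Eos probeset number
    ref: str       # sequence source (citation or GI number)
    strand: str    # "Plus" or "Minus"
    start: int     # first coordinate as printed
    end: int       # second coordinate as printed


def parse_exon_row(line: str) -> ExonRow:
    """Parse 'Pkey Ref Strand start-end'; raise on rows that do not fit."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    pkey, ref, strand, start, end = match.groups()
    return ExonRow(int(pkey), ref, strand, int(start), int(end))


def exon_span(row: ExonRow) -> Tuple[int, int, int]:
    """Return (low, high, length), assuming inclusive coordinates."""
    low, high = sorted((row.start, row.end))
    return low, high, high - low + 1


plus = parse_exon_row("338660 Dunham, I. et al. Plus 24387122-24387266")
minus = parse_exon_row("332865 Dunham, I. et al. Minus 1391482-1391218")
print(exon_span(plus), exon_span(minus))
```

Rows whose coordinates are damaged (for example a missing separator) fail the pattern and raise an error rather than being silently misread, which is the safer behaviour for a table of this kind.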



Pkey Ref Strand Nt_position

332791 Dunham, I. et al. Plus 72720-73315
332792 Dunham, I. et al. Plus 73381-73768
332810 Dunham, I. et al. Plus 304296-304384
332944 Dunham, I. et al. Plus 2414825-2414932
332972 Dunham, I. et al. Plus 2572152-2572236
333133 Dunham, I. et al. Plus 3360058-3360195
333154 Dunham, I. et al. Plus 3615887-3616019
333155 Dunham, I. et al. Plus 3616832-3617003
333227 Dunham, I. et al. Plus 3992866-3992968
333230 Dunham, I. et al. Plus 3995507-3996507
333298 Dunham, I. et al. Plus 4581537-4581947
333304 Dunham, I. et al. Plus 4629943-4630242
333305 Dunham, I. et al. Plus 4630388-4630645
333365 Dunham, I. et al. Plus 4786883-4787283
333383 Dunham, I. et al. Plus 4907179-4907277
333391 Dunham, I. et al. Plus 4916697-4916780
333392 Dunham, I. et al. Plus 4918294-4918433
333397 Dunham, I. et al. Plus 4922466-4922635
333403 Dunham, I. et al. Plus 4925140-4825256
333413 Dunham, I. et al. Plus 4943824-4943974
333445 Dunham, I. et al. Plus 5097827-5097885
333479 Dunham, I. et al. Plus 5272855-5272939
333481 Dunham, I. et al. Plus 5286358-5286505
333483 Dunham, I. et al. Plus 5297945-5298105
333516 Dunham, I. et al. Plus 5570204-5570390
333517 Dunham, I. et al. Plus 5570729-5570925
333518 Dunham, I. et al. Plus 6571761-5572025
333531 Dunham, I. et al. Plus 5622622-5622684
333566 Dunham, I. et al. Plus 5954226-5954473
333572 Dunham, I. et al. Plus 6026896-6027189
333586 Dunham, I. et al. Plus 6246834-6247314
333588 Dunham, I. et al. Plus 6255445-6255779
333594 Dunham, I. et al. Plus 6308990-6309450
333595 Dunham, I. et al. Plus 6323103-6323348
333600 Dunham, I. et al. Plus 6355629-6355925
333601 Dunham, I. et al. Plus 6360075-6360442
333607 Dunham, I. et al. Plus 6504431-6504690
333612 Dunham, I. et al. Plus 654956^6549697
333613 Dunham, I. et al. Plus 655064^6550748
333614 Dunham, I. et al. Plus 6551227-6551389
333624 Dunham, I. et al. Plus 6595146^595244
333626 Dunham, I. et al. Plus 6614174-6614467
333635 Dunham, I. et al. Plus 6663683-6663973
333637 Dunham, I. et al. Plus 6674968-6675134
333642 Dunham, I. et al. Plus 6708760-6709139
333647 Dunham, I. et al. Plus 6772502-6772779
333653 Dunham, I. et al. Plus 6811130-6811392
333654 Dunham, I. et al. Plus 6816731-6816993
333656 Dunham, I. et al. Plus 6822087-6822406
333657 Dunham, I. et al. Plus 6831369-6831445
333658 Dunham, I. et al. Plus 6835282-6835474
333659 Dunham, I. et al. Plus 6836179-6836248
333664 Dunham, I. et al. Plus 7169561-7169742
333666 Dunham, I. et al. Plus 7177117-7177302
333697 Dunham, I. et al. Plus 7203859-7203934
333698 Dunham, I. et al. Plus 7205279-7205383
333699 Dunham, I. et al. Plus 7206101-7206175
333703 Dunham, I. et al. Plus 7215559-7215663
333709 Dunham, I. et al. Plus 7229730-7229835
333747 Dunham, I. et al. Plus 7605884-7606206
333774 Dunham, I. et al. Plus 7716509-7716636
333775 Dunham, I. et al. Plus 7729983-7730149
333806 Dunham, I. et al. Plus 7877475-7877666
333843 Dunham, I. et al. Plus 7978762-7978887
333854 Dunham, I. et al. Plus 8029446-8029524
333873 Dunham, I. et al. Plus 8133266-8133429
333880 Dunham, I. et al. Plus 8151923-8152133
333885 Dunham, I. et al. Plus 8154352-8154437
333918 Dunham, I. et al. Plus 8307124-8307215
333947 Dunham, I. et al. Plus 8579888-8579966
333961 Dunham, I. et al. Plus 8617999-8618104
333981 Dunham, I. et al. Plus 8782374-8782643
333991 Dunham, I. et al. Plus 8837419-8837551
333994 Dunham, I. et al. Plus 8852749-8852894
334030 Dunham, I. et al. Plus 9288463-9288782
334083 Dunham, I. et al. Plus 9837016-9837081
334111 Dunham, I. et al. Plus 10279365-10279531
334135 Dunham, I. et al. Plus 10457085-10457183
334218 Dunham, I. et al. Plus 12680289-12680378
334249 Dunham, I. et al. Plus 13190430-13190574
334262 Dunham, I. et al. Plus 13231452-13231581
334264 Dunham, I. et al. Plus 13234447-13234544
334327 Dunham, I. et al. Plus 13577413-13577496
334328 Dunham, I. et al. Plus 13589868-13589936
334340 Dunham, I. et al. Plus 13642407-13642522
334454 Dunham, I. et al. Plus 14326506-14326738
334504 Dunham, I. et al. Plus 14510206-14510398
334508 Dunham, I. et al. Plus 14514936-14515122
334512 Dunham, I. et al. Plus 14545933-14546366
334582 Dunham, I. et al. Plus 15026255-15026371
334659 Dunham, I. et al. Plus 15460624-15460726
334721 Dunham, I. et al. Plus 15796816-15796987
334723 Dunham, I. et al. Plus 15805317-15805399
334730 Dunham, I. et al. Plus 15967830-15967934
334774 Dunham, I. et al. Plus 16251857-16252178
334778 Dunham, I. et al. Plus 16276180-16276395
334851 Dunham, I. et al. Plus 17820110-17820810
334885 Dunham, I. et al. Plus 19233667-19233787
334902 Dunham, I. et al. Plus 19317083-19317195
334905 Dunham, I. et al. Plus 19322553-19322680
334906 Dunham, I. et al. Plus 19323493-19323590
334910 Dunham, I. et al. Plus 19398155-19398684
335018 Dunham, I. et al. Plus 20688288-20688415
335025 Dunham, I. et al. Plus 20743941-20744050
335033 Dunham, I. et al. Plus 20753188-20753314
335044 Dunham, I. et al. Plus 20842088-20842682
335142 Dunham, I. et al. Plus 21465105-21465186
335157 Dunham, I. et al. Plus 21543302-21544341
335160 Dunham, I. et al. Plus 21573388-21573497
335174 Dunham, I. et al. Plus 21631301-21631447
335188 Dunham, I. et al. Plus 21669118-21669328
335190 Dunham, I. et al. Plus 21680807-21680876
335191 Dunham, I. et al. Plus 21681110-21681183
335193 Dunham, I. et al. Plus 21692208-21692362
335204 Dunham, I. et al. Plus 21750636-21750726
335222 Dunham, I. et al. Plus 21885542-21885608
335226 Dunham, I. et al. Plus 21890838-21890930
335227 Dunham, I. et al. Plus 21892145-21892289
335309 Dunham, I. et al. Plus 22500158-22500276
335310 Dunham, I. et al. Plus 22500714-22500831
335311 Dunham, I. et al. Plus 22501602-22501676
335355 Dunham, I. et al. Plus 22779222-22779516
335382 Dunham, I. et al. Plus 22809167-22809461
335368 Dunham, I. et al. Plus 22843040-22843184
335384 Dunham, I. et al. Plus 22918150-22918263
335385 Dunham, I. et al. Plus 22919072-22919339
335436 Dunham, I. et al. Plus 23427793-23427923
335440 Dunham, I. et al. Plus 23458702-23459017
335441 Dunham, I. et al. Plus 23460632-23460724
335450 Dunham, I. et al. Plus 23480190-23480270
335453 Dunham, I. et al. Plus 23483333-23483459
335458 Dunham, I. et al. Plus 23490034-23490143
335464 Dunham, I. et al. Plus 23500331-23500496
335496 Dunham, I. et al. Plus 24164386-24164545
335497 Dunham, I. et al. Plus 24167666-24167669
335498 Dunham, I. et al. Plus 24172082-24172161
335499 Dunham, I. et al. Plus 24176698-24176869
335500 Dunham, I. et al. Plus 24178236-24178326
335507 Dunham, I. et al. Plus 24219973-24220039
335510 Dunham, I. et al. Plus 24222975-24223118
335513 Dunham, I. et al. Plus 24224272-24224498
335627 Dunham, I. et al. Plus 25150005-25150061
335651 Dunham, I. et al. Plus 25317560-25317696
335655 Dunham, I. et al. Plus 25333211-25333369
335656 Dunham, I. et al. Plus 25333601-25333751
335658 Dunham, I. et al. Plus 25336315-25336406
335663 Dunham, I. et al. Plus 25342680-25342802
335665 Dunham, I. et al. Plus 25344096-25344287
335667 Dunham, I. et al. Plus 25345735-25345856
335668 Dunham, I. et al. Plus 25346313-25346447
335689 Dunham, I. et al. Plus 25454350-25454604
335690 Dunham, I. et al. Plus 25455442-25455625
335715 Dunham, I. et al. Plus 25565941-25566052
335719 Dunham, I. et al. Plus 25593936-25594101
335734 Dunham, I. et al. Plus 25688723-25688869
335744 Dunham, I. et al. Plus 25716483-25716615
335809 Dunham, I. et al. Plus 26310772-26310909
335819 Dunham, I. et al. Plus 26356341-26356470
335822 Dunham, I. et al. Plus 26364087-26364196
335872 Dunham, I. et al. Plus 26820760-26820943
335885 Dunham, I. et al. Plus 26933436-26933534
335968 Dunham, I. et al. Plus 27743843-27744029
335971 Dunham, I. et al. Plus 27752808-27753017
335975 Dunham, I. et al. Plus 27801321-27801391
335976 Dunham, I. et al. Plus 27809041-27809187
335989 Dunham, I. et al. Plus 27983788-27983860
335990 Dunham, I. et al. Plus 27988532-27988608
336010 Dunham, I. et al. Plus 28570239-28570330
336093 Dunham, I. et al. Plus 29556922-29557002
336126 Dunham, I. et al. Plus 30057891-30058105
336129 Dunham, I. et al. Plus 30062259-30062348
336187 Dunham, I. et al. Plus 30433494-30433585
336186 Dunham, I. et al. Plus 30434870-30435004
336225 Dunham, I. et al. Plus 30833614-30833788
336371 Dunham, I. et al. Plus 33968108-33968204
336373 Dunham, I. et al. Plus 33976308-33976504
336377 Dunham, I. et al. Plus 33994489-33994599
336380 Dunham, I. et al. Plus 33995323-33995434
336383 Dunham, I. et al. Plus 34005784-34005964
336384 Dunham, I. et al. Plus 34007429-34007559
336385 Dunham, I. et al. Plus 34007879-34008159
336386 Dunham, I. et al. Plus 34012965-34013115
336441 Dunham, I. et al. Plus 34187606-34187663
336444 Dunham, I. et al. Plus 34190585-34190718
336484 Dunham, I. et al. Plus 34237425-34237505
336497 Dunham, I. et al. Plus 34267190-34267245
336499 Dunham, I. et al. Plus 34267504-34267572
336503 Dunham, I. et al. Plus 34271306-34271372
336548 Dunham, I. et al. Plus 34353881-34354826
336552 Dunham, I. et al. Plus 34356420-34356527
336553 Dunham, I. et al. Plus 34356683-34356753
336567 Dunham, I. et al. Plus 34428228-34428395
336568 Dunham, I. et al. Plus 34428521-34428637
336659 Dunham, I. et al. Plus 1896402-1896478
336715 Dunham, I. et al. Plus 3110198-3110314
336803 Dunham, I. et al. Plus 6106904-6106990
336805 Dunham, I. et al. Plus 6126661-6126786
336850 Dunham, I. et al. Plus 7745284-7745355
336857 Dunham, I. et al. Plus 8130457-8130612
336911 Dunham, I. et al. Plus 11035818-11035984
336949 Dunham, I. et al. Plus 12818687-12818891
336950 Dunham, I. et al. Plus 12875843-12875912
336958 Dunham, I. et al. Plus 13203550-13203973
336993 Dunham, I. et al. Plus 15096270-15096324
337076 Dunham, I. et al. Plus 19338177-19338679
337109 Dunham, I. et al. Plus 21166580-21166650
337123 Dunham, I. et al. Plus 22052874-22052942
337151 Dunham, I. et al. Plus 23106433-23106510
337189 Dunham, I. et al. Plus 24225887-24225954
337241 Dunham, I. et al. Plus 27280182-27280313
337337 Dunham, I. et al. Plus 30395182-30395285
337353 Dunham, I. et al. Plus 30804624-30804780
337384 Dunham, I. et al. Plus 31333399*1333580
337396 Dunham, I. et al. Plus 31585902-31586067
337414 Dunham, I. et al. Plus 31953012-31953205
337418 Dunham, I. et al. Plus 32014049-32014131
337461 Dunham, I. et al. Plus 32803968-32804028
337480 Dunham, I. et al. Plus 33219714-33219779
337482 Dunham, I. et al. Plus 33227865-33227946
337483 Dunham, I. et al. Plus 33237292-33237427
337490 Dunham, I. et al. Plus 33318571-33318644
337522 Dunham, I. et al. Plus 33963188-33963979
337532 Dunham, I. et al. Plus 34187269-34187366
337552 Dunham, I. et al. Plus 19497-19600
337584 Dunham, I. et al. Plus 945236-945452
337611 Dunham, I. et al. Plus 1482883-1483016
337672 Dunham, I. et al. Plus 3331236-3331313
337693 Dunham, I. et al. Plus 3575975*576153
337738 Dunham, I. et al. Plus 3865738*865814
337926 Dunham, I. et al. Plus 62B637T62B6470
337927 Dunham, I. et al. Plus 634303^6343172
337935 Dunham, I. et al. Plus 6534661-6534782
337944 Dunham, I. et al. Plus 6589383-6589450
337954 Dunham, I. et al. Plus 6831483*831620
337996 Dunham, I. et al. Plus 7445532-7445633
338004 Dunham, I. et al. Plus 7601363-7601520
338016 Dunham, I. et al. Plus 7863131-7863310
338174 Dunham, I. et al. Plus 12771102-12771268
338176 Dunham, I. et al. Plus 12774072-12774223
338238 Dunham, I. et al. Plus 14661936-14662015
338277 Dunham, I. et al. Plus 16167622-16167962
338294 Dunham, I. et al. Plus 16463958-16464539
338316 Dunham, I. et al. Plus 17089711-17089988
338323 Dunham, I. et al. Plus 17154655-17154792
338324 Dunham, I. et al. Plus 17155309-17155574
338386 Dunham, I. et al. Plus 18611213-18611407
338398 Dunham, I. et al. Plus 18953492-18953581
338410 Dunham, I. et al. Plus 19292807-19292916
338414 Dunham, I. et al. Plus 19345573-19345660
338460 Dunham, I. et al. Plus 20233372-20233488
338481 Dunham, I. et al. Plus 20942659-20942873
338489 Dunham, I. et al. Plus 21142605-21143049
338500 Dunham, I. et al. Plus 21253847-21253974
338514 Dunham, I. et al. Plus 21379420-21379655
338530 Dunham, I. et al. Plus 21636361-21636509
338620 Dunham, I. et al. Plus 23540239-23540334
338631 Dunham, I. et al. Plus 23711167-23711241
338653 Dunham, I. et al. Plus 24219427-24219509
338660 Dunham, I. et al. Plus 24387122-24387266
338704 Dunham, I. et al. Plus 25230432-25230548
338847 Dunham, I. et al. Plus 27995337-27995420
338887 Dunham, I. et al. Plus 28465244-28465384
338895 Dunham, I. et al. Plus 28598893-28599135
338915 Dunham, I. et al. Plus 28824881-28824977
338925 Dunham, I. et al. Plus 28883892-28884036
338936 Dunham, I. et al. Plus 29148G22-29148160
338952 Dunham, I. et al. Plus 29418831-29418968
338980 Dunham, I. et al. Plus 29896789-29896874
338981 Dunham, I. et al. Plus 29897917-29898008
338986 Dunham, I. et al. Plus 30007287-30007415
339009 Dunham, I. et al. Plus 30348477-30348598
339017 Dunham, I. et al. Plus 30420896-30421090
339045 Dunham, I. et al. Plus 30744286-30744356
339046 Dunham, I. et al. Plus 30746269-30746420
339059 Dunham, I. et al. Plus 30814655-30814801
339067 Dunham, I. et al. Plus 30869347-30869412
339069 Dunham, I. et al. Plus 30880975-30881070
339078 Dunham, I. et al. Plus 30914310-30914423
339084 Dunham, I. et al. Plus 30944556-30944803
339101 Dunham, I. et al. Plus 31158047-31158123
339102 Dunham, I. et al. Plus 31169321-31169563
339103 Dunham, I. et al. Plus 31170343-31170454
339115 Dunham, I. et al. Plus 31459869-31459927
339157 Dunham, I. et al. Plus 32131701-32131833
339166 Dunham, I. et al. Plus 32210902-32211006
339167 Dunham, I. et al. Plus 32213567-32213730
339288 Dunham, I. et al. Plus 3316961143169691
339289 Dunham, I. et al. Plus 33186756-33186903
339291 Dunham, I. et al. Plus 33205057-33205247
339407 Dunham, I. et al. Plus 34189461-34189620
332865 Dunham, I. et al. Minus 1391482-1391218
332881 Dunham, I. et al. Minus 1563520-1563184
332930 Dunham, I. et al. Minus 2022565-2022497
332931 Dunham, I. et al. Minus 2023651-2023562
332984 Dunham, I. et al. Minus 2632606-2632457
332986 Dunham, I. et al. Minus 2635398-2635206
332997 Dunham, I. et al. Minus 2710509-2710375
333051 Dunham, I. et al. Minus 2991973-2991840
333061 Dunham, I. et al. Minus 3029631-3029527
333064 Dunham, I. et al. Minus 3030722-3030623
333096 Dunham, I. et al. Minus 31842344184118
333099 Dunham, I. et al. Minus 32067964206674
333106 Dunham, I. et al. Minus 32307444230547
333160 Dunham, I. et al. Minus 36548934654678
333163 Dunham, I. et al. Minus 36651244664962
333165 Dunham, I. et al. Minus 36740524673905
333166 Dunham, I. et al. Minus 3694664-3694567
333170 Dunham, I. et al. Minus 37333944733299
333174 Dunham, I. et al. Minus 37642844764210
333188 Dunham, I. et al. Minus 38269904826863
333214 Dunham, I. et al. Minus 39665594966437
333232 Dunham, I. et al. Minus 4001551-4001365
333237 Dunham, I. et al. Minus 4003326-4003219
333239 Dunham, I. et al. Minus 4095861-4094462
333255 Dunham, I. et al. Minus 4297883-4297716
333259 Dunham, I. et al. Minus 4306769-4306639
333274 Dunham, I. et al. Minus 4389146-4388954
333290 Dunham, I. et al. Minus 4530734-4530554
333295 Dunham, I. et al. Minus 4549290-4549198
333296 Dunham, I. et al. Minus 4550766-4550644
333310 Dunham, I. et al. Minus 4637315-4637232
333311 Dunham, I. et al. Minus 4637933-4637844
333312 Dunham, I. et al. Minus 4638794-4638635
333313 Dunham, I. et al. Minus 4639397-4639277
333315 Dunham, I. et al. Minus 5405980-5405876
333318 Dunham, I. et al. Minus 4642636-4642564
333321 Dunham, I. et al. Minus 4649080-4648934
333327 Dunham, I. et al. Minus 4657947-4657828
333335 Dunham, I. et al. Minus 4672656-4672564
333337 Dunham, I. et al. Minus 4677930-4677841
333454 Dunham, I. et al. Minus 5137007-5136880
333456 Dunham, I. et al. Minus 5143942*143806
333459 Dunham, I. et al. Minus 5144548-5144344
333470 Dunham, I. et al. Minus 5223319*223088
333493 Dunham, I. et al. Minus 4637315-4637232
333496 Dunham, I. et al. Minus 5404643*404523
333498 Dunham, I. et al. Minus 5405980*405876
333510 Dunham, I. et al. Minus 5557628*557469
333546 Dunham, I. et al. Minus 5886643*886442
333561 Dunham, I. et al. Minus 5903659-5903590
333738 Dunham, I. et al. Minus 7552160-7552084
333780 Dunham, I. et al. Minus 7750367-7750277
333783 Dunham, I. et al. Minus 7751850-7751777
333818 Dunham, I. et al. Minus 7911959-7911762
333894 Dunham, I. et al. Minus 8188855*188709
333897 Dunham, I. et al. Minus 8194390*194284
333900 Dunham, I. et al. Minus 6200268*200122
333909 Dunham, I. et al. Minus 8229639*229477
333936 Dunham, I. et al. Minus 8512805*512564
333944 Dunham, I. et al. Minus 8557051*556936
334040 Dunham, I. et al. Minus 9342995-9342934
334154 Dunham, I. et al. Minus 10570714-10570572
334178 Dunham, I. et al. Minus 11755052-11754971
334188 Dunham, I. et al. Minus 11925963-11925834
334273 Dunham, I. et al. Minus 13265608-13265522
334282 Dunham, I. et al. Minus 13285293-13285178
334285 Dunham, I. et al. Minus 13289990-13289793
334286 Dunham, I. et al. Minus 13291759-13291569
334303 Dunham, I. et al. Minus 13454331-13454217
334305 Dunham, I. et al. Minus 13456310-13456209
334306 Dunham, I. et al. Minus 13461157-13461049
334320 Dunham, I. et al. Minus 13496857-13496717
334352 Dunham, I. et al. Minus 13675908-13675828
334353 Dunham, I. et al. Minus 13683722-13683596
334359 Dunham, I. et al. Minus 13728664-13728534
334363 Dunham, I. et al. Minus 13740004-13739812
334365 Dunham, I. et al. Minus 13742078-13741971
334399 Dunham, I. et al. Minus 14186289-14186163
334409 Dunham, I. et al. Minus 14195181-14195075
334414 Dunham, I. et al. Minus 14234033-14233932
334470 Dunham, I. et al. Minus 1438958M4389442
334483 Dunham, I. et al. Minus 14428355-14428281
334489 Dunham, I. et al. Minus 14455428-14454288
334498 Dunham, I. et al. Minus 14483789-14483700
334501 Dunham, I. et al. Minus 14487509-14487356
334502 Dunham, I. et al. Minus 14488605-14488526
334543 Dunham, I. et al. Minus 14834496-14834116
334622 Dunham, I. et al. Minus 15191678-15191609
334650 Dunham, I. et al. Minus 15371251-15371178
334680 Dunham, I. et al. Minus 15520047-15519887
334745 Dunham, I. et al. Minus 16049960-16049653
334756 Dunham, I. et al. Minus 16128678-16128528
334758 Dunham, I. et al. Minus 16132368-16132233
334761 Dunham, I. et al. Minus 16138424-16138319
334763 Dunham, I. et al. Minus 16148136-16148077
334784 Dunham, I. et al. Minus 16294548-16294360
334790 Dunham, I. et al. Minus 16307576-16307509
334793 Dunham, I. et al. Minus 16330748-16330681
334802 Dunham, I. et al. Minus 16413158-16413026
334820 Dunham, I. et al. Minus 16764338-16764249
334824 Dunham, I. et al. Minus 16857777-16857674
334832 Dunham, I. et al. Minus 17173957-17173760
334842 Dunham, I. et al. Minus 17464352-17464181
334844 Dunham, I. et al. Minus 17503891-17503768
334857 Dunham, I. et al. Minus 18488368-18488242
334927 Dunham, I. et al. Minus 19988711-19987653
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334939 Dunham, I. etaL 
334951 Dunham, I. et al.
334969 Dunham, I. et al.
334972 Dunham, I. et al.
335050 Dunham, I. et al.
335078 Dunham, I. et al.
335102 Dunham, I. et al.
335105 Dunham, I. et al.
335110 Dunham, I. et al.
335111 Dunham, I. et al.
335115 Dunham, I. et al.
335116 Dunham, I. et al.
335185 Dunham, I. et al.
335186 Dunham, I. et al.
335230 Dunham, I. et al.
335236 Dunham, I. et al.
335243 Dunham, I. et al.
335249 Dunham, I. et al.
335258 Dunham, I. et al.
335261 Dunham, I. et al.
335276 Dunham, I. et al.
335279 Dunham, I. et al.
335330 Dunham, I. et al.
335331 Dunham, I. et al.
335334 Dunham, I. et al.
335346 Dunham, I. et al.
335349 Dunham, I. et al.
335611 Dunham, I. et al.
335612 Dunham, I. et al.
335671 Dunham, I. et al.
335676 Dunham, I. et al.
335680 Dunham, I. et al.
335750 Dunham, I. et al.
335752 Dunham, I. et al.
335755 Dunham, I. et al.
335767 Dunham, I. et al.
335774 Dunham, I. et al.
335777 Dunham, I. et al.
335778 Dunham, I. et al.
335797 Dunham, I. et al.
335800 Dunham, I. et al.
335818 Dunham, I. et al.
335834 Dunham, I. et al.
335840 Dunham, I. et al.
335844 Dunham, I. et al.
335846 Dunham, I. et al.
335856 Dunham, I. et al.
335887 Dunham, I. et al.
335888 Dunham, I. et al.
335889 Dunham, I. et al.
335890 Dunham, I. et al.
335893 Dunham, I. et al.
335895 Dunham, I. et al.
335896 Dunham, I. et al.
335900 Dunham, I. et al.
335907 Dunham, I. et al.
335943 Dunham, I. et al.
335956 Dunham, I. et al.
335959 Dunham, I. et al.
335962 Dunham, I. et al.
336040 Dunham, I. et al.
336044 Dunham, I. et al.
336047 Dunham, I. et al.
336068 Dunham, I. et al.
336143 Dunham, I. et al.
336158 Dunham, I. et al.
336174 Dunham, I. et al.
336223 Dunham, I. et al.
336245 Dunham, I. et al.



Minus 20131162-20131054 

Minus 20147708-20147502 

Minus 20168176-20188020 

Minus 20294734-20294611 

Minus 20884109-20883951 

Minus 21059529-21059458 

Minus 21313841-21313598 

Minus 21320563-21320440 

Minus 21334136-21333811 

Minus 21335946-21335809 

Minus 21388250-21388146 

Minus 21388573-21388414 

Minus 21651593-21651522 

Minus 21656436-21656338 

Minus 21899517-21898678 

Minus 21915016-21914870 

Minus 21933519-21933365 

Minus 21950851-21950669 

Minus 22043431-22043262 

Minus 22063937-22063772 

Minus 22154036-22153937 

Minus 22168834-22168638 

Minus 22556589-22556422 

Minus 22556823-22556708 

Minus 22560390-22560136 

Minus 22641097-22640918 

Minus 22661861-22661271 

Minus 25070825-25070706 

Minus 25072328-25072142 

Minus 25358629-25358533 

Minus 25395274-25395152 

Minus 25402437-25402361 

Minus 25732501-25731972 

Minus 25757026-25756890 

Minus 25763806-25763747 

Minus 25819547-25819218 

Minus 25883733-25883572 

Minus 25885770-25885599 

Minus 25886469-25886334 

Minus 25958182-25958030 

Minus 25985373-25985280 

Minus 26323886-26323744 

Minus 26391707-26391530 

Minus 26420596-26420538 

Minus 26433427-26433344 

Minus 26436727-26436621 

Minus 26662452-26662346 

Minus 26939225-26938782 

Minus 26943037-26942820 

Minus 26946988-26946901 

Minus 26949087-26948665 

Minus 26973898-26973747 

Minus 26975307-26975239 

Minus 26977639-26977558 

Minus 26980354-26980238 

Minus 27013352-27013273 

Minus 27446610-27446378 

Minus 27653729-27653635 

Minus 27682313-27662145 

Minus 27704276-27704144 

Minus 29036458-29036300 

Minus 29043828-29043727 

Minus 29050617-29050466 

Minus 29252077-29251969 

Minus 30135948-30135854 

Minus 30163730-30163610 

Minus 30241988-30241839 

Minus 30816306-30816195 

Minus 31420569-31420509 
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336274 Dunham, I. etaL 
336318 Dunham, I. et al.
336326 Dunham, I. et al.
336339 Dunham, I. et al.
336340 Dunham, I. et al.
336355 Dunham, I. et al.
336392 Dunham, I. et al.
336393 Dunham, I. et al.
336394 Dunham, I. et al.
336400 Dunham, I. et al.
336402 Dunham, I. et al.
336413 Dunham, I. et al.
336424 Dunham, I. et al.
336425 Dunham, I. et al.
336437 Dunham, I. et al.
336447 Dunham, I. et al.
336449 Dunham, I. et al.
336466 Dunham, I. et al.
336492 Dunham, I. et al.
336511 Dunham, I. et al.
336512 Dunham, I. et al.
336520 Dunham, I. et al.
336522 Dunham, I. et al.
336524 Dunham, I. et al.
336527 Dunham, I. et al.
336534 Dunham, I. et al.
336536 Dunham, I. et al.
336542 Dunham, I. et al.
336556 Dunham, I. et al.
336557 Dunham, I. et al.
336558 Dunham, I. et al.
336559 Dunham, I. et al.
336560 Dunham, I. et al.
336561 Dunham, I. et al.
336597 Dunham, I. et al.
336601 Dunham, I. et al.
336642 Dunham, I. et al.
336645 Dunham, I. et al.
336662 Dunham, I. et al.
336664 Dunham, I. et al.
336676 Dunham, I. et al.
336684 Dunham, I. et al.
336686 Dunham, I. et al.
336714 Dunham, I. et al.
336719 Dunham, I. et al.
336736 Dunham, I. et al.
336744 Dunham, I. et al.
336786 Dunham, I. et al.
336793 Dunham, I. et al.
336859 Dunham, I. et al.
336863 Dunham, I. et al.
336933 Dunham, I. et al.
336942 Dunham, I. et al.
336960 Dunham, I. et al.
336969 Dunham, I. et al.
336971 Dunham, I. et al.
337003 Dunham, I. et al.
337011 Dunham, I. et al.
337070 Dunham, I. et al.
337072 Dunham, I. et al.
337086 Dunham, I. et al.
337140 Dunham, I. et al.
337193 Dunham, I. et al.
337256 Dunham, I. et al.
337278 Dunham, I. et al.
337284 Dunham, I. et al.
337293 Dunham, I. et al.
337316 Dunham, I. et al.
337326 Dunham, I. et al.



Minus 32085468-32085303 

Minus 33364452-33364338 

Minus 33567328-33567201 

Minus 33798479-33798330 

Minus 33812069-33811915 

Minus 33874750-33874649 

Minus 34015868-34015736 

Minus 34016145-34015951 

Minus 34016457-34016298 

Minus 34023437-34023298 

Minus 34024090-34023981 

Minus 34046702-34046576 

Minus 34055549-34055491 

Minus 34058544-34058446 

Minus 34074154-34074090 

Minus 34198207-34197996 

Minus 34204707-34204577 

Minus 34213195-34213046 

Minus 34255578-34255437 

Minus 34277480-34277351 

Minus 34278373-34278275 

Minus 34319184-3431910! 

Minus 34320169-34320056 

Minus 34321055-34320921 

Minus 34322071-34321966 

Minus 34326797-34326620 

Minus 34327678-34327538 

Minus 34331316-34331183 

Minus 34375244-34374907 

Minus 34375443-34375341 

Minus 34375825-34375698 

Minus 34376430-34376261 

Minus 34376814-34376596 

Minus 34377168-34376928 

Minus 7627912-7627757 

Minus 13265853-13265654 

Minus 1304281-1304212 

Minus 1351268-1351168 

Minus 2158060-2157993 

Minus 1993558-1993481 

Minus 2022565-2022497 

Minus 2158060-2157993 

Minus 2160698-2160486 

Minus 3094026-3093871 

Minus 3331631-3331503 

Minus 4093128-4093041 

Minus 4333001-4332848 

Minus 5419973-5419873 

Minus 5631345-5631237 

Minus 8201756-8201561 

Minus 8396673^396425 

Minus 11760045-11759981 

Minus 12027537-12027455 

Minus 13267243-13267172 

Minus 13725722-13725643 

Minus 13732308-13732221 

Minus 15523541-15523422 

Minus 16106423-16106080 

Minus 19034423-19034321 

Minus 19077452-19077323 

Minus 19657011-19656681 

Minus 22649450-22649388 

Minus 24594969-24594874 

Minus 27659956-27659876 

Minus 28429017-28428848 

Minus 28491414-28491094 

Minus 28846334-28845873 

Minus 29657129-29656997 

Minus 30017199-30017069 
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337382 Dunham, LetaL Minus 31233666-31233579 

337392 Dunham, I. et al. Minus 31442311-31442229
337406 Dunham, I. et al. Minus 31864840-31864588
337412 Dunham, I. et al. Minus 31916487-31916312
337419 Dunham, I. et al. Minus 32021496-32021170
337436 Dunham, I. et al. Minus 32257869-32257739
337455 Dunham, I. et al. Minus 32434517-32434425
337509 Dunham, I. et al. Minus 33414613-33414498
337518 Dunham, I. et al. Minus 33796750-33796647
337529 Dunham, I. et al. Minus 34043668-34043546
337533 Dunham, I. et al. Minus 34193388-34193261
337539 Dunham, I. et al. Minus 34254490-34254322
337551 Dunham, I. et al. Minus 34524446-34524362
337553 Dunham, I. et al. Minus 24230-24160
337591 Dunham, I. et al. Minus 1006414-1006184
337592 Dunham, I. et al. Minus 1007791-1007634
337593 Dunham, I. et al. Minus 1009460-1009291
337607 Dunham, I. et al. Minus 1355719-1355637
337612 Dunham, I. et al. Minus 1570235-1570142
337635 Dunham, I. et al. Minus 2169690-2169569
337824 Dunham, I. et al. Minus 4559540-4559266
337825 Dunham, I. et al. Minus 4567155-4567005
337850 Dunham, I. et al. Minus 6077143-5076943
337854 Dunham, I. et al. Minus 5153435-5153272
337913 Dunham, I. et al. Minus 6149843-6149786
337915 Dunham, I. et al. Minus 5922748-5922690
337968 Dunham, I. et al. Minus 7095797-7095680
338010 Dunham, I. et al. Minus 7754282-7754184
338012 Dunham, I. et al. Minus 7761421-7761351
338017 Dunham, I. et al. Minus 7864521-7864401
338065 Dunham, I. et al. Minus 7235048-7234950
338094 Dunham, I. et al. Minus 9595602-9595440
338129 Dunham, I. et al. Minus 10915338-10915237
338132 Dunham, I. et al. Minus 10989617-10989530
338150 Dunham, I. et al. Minus 11478551-11478355
338157 Dunham, I. et al. Minus 11731444-11731375
338195 Dunham, I. et al. Minus 13484103-13483972
338255 Dunham, I. et al. Minus 15242294-15242231
338276 Dunham, I. et al. Minus 16109555-16109398
338431 Dunham, I. et al. Minus 19747608-19747496
338448 Dunham, I. et al. Minus 20151152-20151054
338451 Dunham, I. et al. Minus 20174286-20174193
338477 Dunham, I. et al. Minus 20821897-20821838
338534 Dunham, I. et al. Minus 21771238-21771170
338682 Dunham, I. et al. Minus 24800712-24800461
338684 Dunham, I. et al. Minus 24827522-24827428
338689 Dunham, I. et al. Minus 24893073-24892972
338695 Dunham, I. et al. Minus 25104153-25104016
338825 Dunham, I. et al. Minus 27664798-27664712
338842 Dunham, I. et al. Minus 27824238-27824079
338893 Dunham, I. et al. Minus 28491807-28491631
338904 Dunham, I. et al. Minus 28766345-28766253
338935 Dunham, I. et al. Minus 29071537-29071461
339022 Dunham, I. et al. Minus 30523414-30523289
339034 Dunham, I. et al. Minus 30621603-30621422
339190 Dunham, I. et al. Minus 32403103-32402985
339212 Dunham, I. et al. Minus 32494335-32494210
339213 Dunham, I. et al. Minus 32496590-32496440
339216 Dunham, I. et al. Minus 32504250-32504109
339233 Dunham, I. et al. Minus 32751331-32751238
339258 Dunham, I. et al. Minus 32934756-32934615
339262 Dunham, I. et al. Minus 32971258-32971090
339263 Dunham, I. et al. Minus 32974634-32974452
339265 Dunham, I. et al. Minus 32975943-32975806
339338 Dunham, I. et al. Minus 33468728-33468606
339396 Dunham, I. et al. Minus 34017306-34017205
339400 Dunham, I. et al. Minus 34045024-34044940
339425 Dunham, I. et al. Minus 34407911-34407798

325207 6552430 Plus 140049-140170 
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329568 3962490 
329517 3983513 
325313 5866865 
325327 5866875 
5 325317 5866878 
325257 5866895 
329632 6729060 
325371 5866920 
325375 5866920 
10 325378 5866920 

325469 6017034 

325470 6017034 
325576 6552443 
325505 6682451 

15 325543 6682452 

329635 5302817 

329636 5302817 
325593 5866992 
325675 5867014 

20 325704 5867028 
325662 6138923 
325785 6381957 
325666 6469822 
325818 6682490 

25 329777 6002090 
329768 6015501 
329759 6048280 
329731 6065783 
329687 6117856 

30 329676 6272128 
329667 6272129 

329669 6272129 

329670 6272129 
329641 6468233 

35 329791 6469354 
325826 5867048 
325829 5867052 
329888 6067149 
329893 6525313 

40 329899 6563505 
325988 5867064 
325855 5867067 
325999 5867073 
326001 5867073 

45 325886 5867087 
325882 5867087 
325905 5867104 
325922 5867122 
325937 5867132 

50 325960 5867147 
325961 5867147 

325838 6552452 

325839 6552452 

325840 6552452 
55 325844 6552453 

325870 6682492 
329984 4646193 
329976 4878063 
329935 6165200 

60 329916 6223624 
330021 6671889 
330024 6671908 
330028 6671908 
326033 5867178 

65 326036 5867178 
326056 5867184 
326116 5867193 
326122 5867194 
326138 5867203 



Plus 36331-36750

Minus 53197-53269 

Minus 27385-28192 

Plus 75189-75264 

Minus 156551-156649 

Plus 10867-10955 

Plus 192813-193017 

Minus 1035422-1035536 

Minus 1165503-1165810 

Minus 1187981-1188167 

Plus 286823-286991 

Plus 287578-287663 

Minus 137769-137894 

Minus 240852-240946 

Plus 151873-152057 

Minus 62522-62622 

Minus 64969-65078 

Minus 469726-469860 

Plus 955517-955711

Plus 156198-156387

Plus 370618-370763 

Plus 61849-62003 

Plus 16769-16857 

Minus 120278-120559 

Minus 191389-191479 

Plus 118315-118422 

Minus 37647-37730 

Plus 158772-158900 

Minus 22165-22288 

Minus 142207-142359 

Plus 101355-101745 

Plus 131223-131291 

Plus 131351-131495

Minus 105995-106107 

Minus 131982-132089 

Minus 46361-46458 

Plus 232674-233060 

Minus 37227-37473 

Minus 166123-166791 

Minus 111058-111783 

Plus 17349-17606 

Plus 276141-276251 

Plus 149115-149192 

Plus 155223-155348 

Plus 194694-194915 

Minus 8178-8347 

Plus 78779-78876 

Minus 329063-329134 

Minus 152633-152902 

Minus 162506-162635 

Minus 165106-165209 

Plus 171451-171532 

Plus 181964-182037 

Plus 184380-184547 

Minus 14188-14332 

Plus 228209-228297 

Minus 139780-139890 

Minus 62584-62691 

Minus 6905949127 

Plus 36396-37195 

Plus 120938-121032 

Minus 1005-1270 

Minus 30015-30144 

Plus 37261-37333 

Minus 120215-120273 

Minus 181553-181690 

Plus 45548-45604 

Plus 144397-144683 

Minus 179374-179436 
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326145 5867204 
326160 5867211 
326201 5867216 
326207 5867222 
5 326226 5867230 
326233 5867232 
326238 5867260 
326241 5887260 
326243 5867261 

10 326251 5867263 
326268 5867267 
326124 5916395 
326339 6056311 
330049 4567182 

15 326358 5867293 
326365 5867297 
326379 5B67327 
326382 5867327 
326390 5867340 

20 326424 5667369 
326453 5867399 
326472 5867404 
326492 5867422 
326533 5867441 

25 330117 6015201 

330115 6015202 

330116 6015202 

330095 6015278 

330096 6015278 
30 326644 5867559 

326713 5867595 
326745 5867611 

326752 5867615 

326753 5867616 
35 326598 5867634 

326667 6552455 
326855 6552460 
326812 6682504 
327005 5867664 

40 327008 5867664 
326896 5867680 
326904 5867684 
326951 6004446 
326941 6004446 

45 326943 6004446 
326928 6456782 

326958 6469836 

326959 6469836 
327039 6531965 

50 327127 6682520 
330158 6580367 
327204 5867447 
327208 5867447 
327266 5867462 

55 327277 5867473 
327289 5867481 
327296 5867492 
327237 5867544 
327145 5867548 

60 327333 5902477 
327335 5902477 
327343 6017017 
327350 6249563 
327358 6552411 

65 327360 6552411 
327409 5867750 
327424 5867751 
327430 5867754 
327470 5867772 



Minus 52599-52814

Minus 182758-183222

Minus 166168-166959

Plus 48139-48219

Plus 52644-52705

Plus 124788-124863

Plus 64282-64338

Minus 181648-181916

Plus 123838-123978

Minus 82716-82822

Plus 122114-122765

Plus 407102-407560

Minus 164637-165251

Minus 314662-315210

O ItUUfc «J 1 life 1 V 


Plus 

1 IUO 


9122-9195 


Minus 


96630-96764 


Plus 


32299-32402 


Mimic 

Mill tUO 


50420-50503 


Minn*? 


108814-110592 


IVIU IUO 


168329-168409 


rluo 


86999-86493 


riuo 


903739-2Q3940 


rlUO 


190768-120991 


Mimic 
mill Lib 


532153-539280 


Mimic 
mil luo 


7340-7680 


PI HQ 

riuo 


11403-11677 


riuo 


19100-19418 


Pine 


15343-15814 
IJOHJ i«jo IH 


Plus 


49370^9458 


PIUS 
r IUO 


42884-42819 


r iub 


191511-121798 

(£ 13 1 1 l£ 1 iWJ 


Phrc 
riUo 


197130-197318 


Mimic 
mil lUo 


1214-1562 




19454-12511 

i£HwH-f£U 1 1 


Pine 

rluo 


68955-69014 


riuo 


149311-142441 

l*ft«3 1 1 IHfcHH 1 


Mimic 

nuiiuo 


111390-111463 


Pine 

riuo 


18981 1-1 89041 

IOOO I 1" lOOOH 1 


rluo 


61 0847-61 OQ07 


Phic 

riuo 


928737-09881 1 

SCO/Of OfcOOl 1 


Mimic 
muiuo 


12032-12122 


Mini ic 


9280-9806 


PlllC 

riuo 


193812-143998 

1 OOO l£ 1 OvOOQ 


Pine 
riuo 


62018-6289S 


Mimic 


89242-89427 


Minus 291007-291219

Minus 42952-43082

Minus 43159-43301

Plus 694486-694998

Plus 41925-42083

Plus 81966-82456

Plus 165135-165239

Plus 180805-180864

Minus 82400-62615

Minus 165616-165715

Plus 49296-49536

Plus 7627-8166

Minus 59702-59813

Minus 40462-40551

Minus 141446-141609

Minus 142979-143124

Minus 12288-12395

Minus 41890-41985

Minus 3802-3950

Minus 6255-6422

Minus 52949-53011

Plus 160442-160598

Plus 1320-1403

Plus 150910-150973
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327460 6004455 
327498 6017023 

327509 6117815 

327510 6117815 
5 327512 6117815 

327535 6525279 
330163 6042042 
330171 6648220 
327579 5867824 

10 327672 5867843 
327629 5867872 
327640 5867890 
327649 5867899 
327612 6525283 

15 327718 6525284 
327801 5867924 

327762 5867961 

327763 5867961 
327776 5867964 

20 327822 5867968 
327823 5867966 
327807 5867968 
327845 6531962 
330228 6013527 

25 330190 6165182 
328122 5868031 
328132 5868038 
328159 5868065 
328168 5868071 

30 328175 5868073 
328217 5868096 
327665 5868130 
327866 5868131 
327870 5868131 

35 327879 5868142 
327902 5868158 
327918 5868165 
327934 5868184 
327959 5868210 

40 327976 5868212 
328020 5902482 
328042 5902482 
328008 5902482 
330301 2905862 

45 330299 2905881 
328274 5868219 
328595 5868224 
328591 5868227 
328668 5868254 

50 328677 5868256 
328687 5868262 
328706 5868270 
328711 5868271 
328730 5868289 

55 328732 5868289 
328734 5868289 
328752 5868298 
328755 5868301 
328761 5868302 

60 328775 5868309 
326784 5868309 
328787 5868309 
328809 5868327 
328829 5868337 

65 328280 5868352 
328311 5868371 
328318 5868373 
328323 5868373 
328348 5868383 



Plus 175245-175343

Minus 42178-42283 

Minus 54882-55053 

Minus 56824-56944 

Plus 176256-176325 

Plus 19105-19175 

Minus 20321-20385 

Plus 110889-111575 

Minus 37229-38335 

Minus 69649-69740 

Plus 49692-49811 

Plus 9448-9566 

Plus 205871-205927 

Plus 2747-2924 

Plus 86123-86186

Plus 23239-23346 

Minus 50303-50439 

Plus 229347-229476 

Minus 164308-164486 

Minus 168886-169633 

Minus 170359-170433 

Plus 33745-33811 

Plus 193402-193549 

Minus 3719-3787

Plus 36103-36243

Plus 158474-158656 

Minus 126737-126839 

Minus 52957-53162 

Plus 60321-60479 

Plus 208-271 

Minus 3742-4362 

Plus 61503-62205 

Minus 2893-3046 

Plus 53558-53757

Minus 77722-77793 

Minus 133339-133467 

Plus 547530-547591 

Plus 41830-42036

Minus 46497-46682 

Minus 349301-349409 

Minus 556386-556652 

Minus 1985085-1986626 

Plus 296663-297151

Minus 4420-5781 

Minus 1020-1382 

Minus 31244-31439 

Plus 148738-148967 

Minus 237647-237726 

Minus 10888-10984 

Minus 58708-58950 

Plus 624479-624585 

Plus 165501-165614 

Minus 97797-97990 

Plus 8068-8214 

Plus 37437-37550 

Plus 50559-50747 

Minus 114911-115087 

Minus 145959-146446 

Minus 239308-239412 

Plus 12845-12920 

Minus 74523-74604 

Plus 135772-135963 

Plus 91792-91849 

Plus 36309-36630 

Plus 160563-160631 

Minus 170560-170826 

Plus 414945-415620

Minus 1080089-1080235 

Minus 260272-260379 
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328377 5868390 


Plus 


16947-17023 


328436 5868417 Plus 203760-203904
328504 5868471 Plus 47064-47217
328506 5866471 Plus 60716-60830
328522 5868477 Plus 1972307-1972452
328525 5868482 Plus 12387-14313
328541 5868486 Plus 130956-131050
328662 6004473 Plus 1184773-1184855


OOftfifiQ fiflnM73 

OtOQOO OUUH** / O 


Plus 


1185279-1186634 




Will \U9 


291716-291948 




Mimic 


3884-3952 




Mrniic 

(¥111 IU9 


428829-428893 


328936 5868500 Minus 1352202-1352259


QOQQQQ fifmddfll 
0£09G9 CUvrt*tO 1 


Minus 


131139-131320 

Ivl 1 Uv IWI V 


00004,1 Rififi7fiS 

0£03HI U*K)OfWiJ 


Minus 


9817-9885 

vU 1 » WWW 


d£0*7*40 OHUU/OJ 


Phic 


28227-28413 


OOOQCO 

D*KJO/ O 


Plus 


117442-118283 


330316 6007576 Minus 119761-119931


OOOOOU OUOOUtUl 


Mimic 
wit icio 


26413-26820 


OOfwni 3056622 


Mimic 


27522-27614 


O0Uo*tO 40*W*/0 


Mini ic 


19055.19969 


39QfttA 5969561 


Mini ic 
milium 


32819-32939 


•xsjUmo ooooooy 


PilLC 


18971-190% 


qonncQ cqmc7A 


PIiiq 
rlUb 


498453-426541 


09Q1fiff CQRR71 1 
OcglOD OODOYll 


Minus 


13103-13225 


000007 KRA870Q 
OZsttOf 00007m 




133233-1 33339 

1 OOC.OO' 100009 




Minus 


222629-222709 


329333 5868806 Plus 392666-392746
329376 5868859 Plus 52356-52694
329384 £668869 Minus 116524-116662
329140 6017060 Plus 290842-290905
329317 6381976 Plus 614823-615209
329319 6381976 Plus 721390-721470
329129 6588026 Plus 144569-144712
329373 6682537 Minus 38950-39301
329412 6682553 Minus 68948-69041
329424 5868879 Plus 362196-362344
329446 5868886 Plus 84776-84899
329449 5868886 Plus 97697-97771
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TABLE 14: shows genes, including expression sequence tags, down-regulated in prostate 
tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos Hu02
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate
cancer tissues.

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of background-subtracted normal prostate to prostate tumor tissue
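
By way of illustration only, the R1 ratio defined above may be sketched as follows. This is a minimal sketch under stated assumptions: the function name `downregulation_ratio`, the `background` and `floor` parameters, and the sample intensities are hypothetical and are not taken from the present disclosure, which does not specify how background subtraction or averaging was performed on the Hu02 GeneChip data.

```python
from statistics import mean


def downregulation_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of average normal signal to average tumor signal.

    Assumptions (not from the source): a single scalar background is
    subtracted from each average, and each result is clamped to `floor`
    so the ratio is never computed with a zero or negative value.
    """
    if not normal or not tumor:
        raise ValueError("both sample lists must be non-empty")
    normal_avg = max(mean(normal) - background, floor)
    tumor_avg = max(mean(tumor) - background, floor)
    return normal_avg / tumor_avg


# Hypothetical intensities for one probe set across two samples per group.
ratio = downregulation_ratio([420.0, 380.0], [60.0, 40.0], background=20.0)
print(round(ratio, 2))  # 12.67
```

A value above 1 under these assumptions would correspond to lower expression in tumor than in normal tissue, consistent with the down-regulated genes listed in this table.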


331328 AA281133 Hs.56808 ESTs 18.53
320875 D60641 Hs.131921 ESTs 14.55
300994 AI251936 Hs.146298 ESTs 12.17
323461 AM18762 Hs.190044 ESTs 10.55
301015 AA947682 Hs.217173 ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 10.17
319419 AA543095 Hs.13848 ESTs; Highly similar to mitogen-induced [M.musculus] 9.2
323486 C0527B Hs.166800 ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE (LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens] 8.57
324882 AW419080 Hs.250645 ESTs 8
330569 U57796 Hs.57679 zinc finger protein 192 7.88
330126 CH.21_p2 gi|6093735 7.8
316265 AA737400 Hs.142230 ESTs 7.7
323045 AA148950 Hs.188836 ESTs 7.64
320668 R58399 Hs.146217 ESTs 7.4
330769 AA465192 Hs.16514 ESTs 7.15
312614 AI766732 Hs.201194 ESTs 7
314790 AW341754 Hs.189305 ESTs 6.83
309979 AW452118 Hs.257533 EST 6.74
314236 AA743396 Hs.189023 ESTs 6.49
329192 CH.X_hs gi|5868716 6.1
324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 5.99
303685 AW500106 EST cluster (not in UniGene) with exon hit 5.82
314921 AW452382 Hs.257564 ESTs 5.8
315840 AA679001 Hs.192221 ESTs 5.68
332776 AA034364 Hs.256551 ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens] 5.43
313533 AW298141 Hs.157975 ESTs 5.4
303494 F30712 EST cluster (not in UniGene) with exon hit 5.35
317490 AI627358 Hs.148367 ESTs 551
332546 D84454 Hs.21899 solute carrier family 35 (UDP-galactose transporter); member 2 5.25
334719 CH22_FGENES.421_30 5.25
300679 AA813958 Hs.207727 ESTs; Moderately similar to KIAA0071 [H.sapiens] 5.22
311811 AI625304 Hs.190312 ESTs 5.22
315310 AW511298 Hs.256067 ESTs 5.19
312871 H86747 Hs.227602 KIAA1116 protein 5.11
324715 AI739168 EST cluster (not in UniGene) 457
313870 AW206435 Hs.146057 ESTs 457
321453 N50080 Hs.117827 ESTs 4.78
316160 AW197887 Hs.253353 ESTs 4.63
313833 AA766825 EST cluster (not in UniGene) 4.58
315850 AW270550 Hs.116957 ESTs 4.53
303124 AF161350 EST cluster (not in UniGene) with exon hit 4.46
323346 AL134932 Hs.143607 ESTs 4.4
301383 AA913591 Hs.126480 ESTs 455
324513 AW501678 Hs.164577 ESTs 4.28
303480 AA331906 EST cluster (not in UniGene) with exon hit 4.25
323591 AA301270 EST cluster (not in UniGene) 4.22
313603 AW468119 EST cluster (not in UniGene) 4.2
317863 AI733395 Hs.129124 ESTs 4.1
312381 R42049 Hs.195473 ESTs 4.08
317514 AW451570 Hs.126850 ESTs 4.03
319750 AA621606 Hs.117956 ESTs 4.03




WO 02/30268 PCT/US01/32045 



322520 T55958 EST cluster (not in UniGene) 4
314754 AW026761 Hs.134374 ESTs 4
316088 AI990652 Hs.208973 ESTs 4
318473 AI939339 Hs.146883 ESTs 3.96
307848 AI364186 EST singleton (not in UniGene) with exon hit 3.95
300730 AW449204 Hs.257125 ESTs 3.94
303034 W60843 Hs.31570 ESTs 3.93
324668 AI679131 Hs.201424 ESTs 3.9
324674 AA541323 Hs.115831 ESTs 3.88
300547 N53442 Hs.143443 ESTs 3.83
316100 AW203986 Hs2130Q3 ESTs 3.79
314801 AA481027 Hs.127336 ESTs; Weakly similar to ORF YGR245C [S.cerevisiae] 3.75
320856 D59945 EST cluster (not in UniGene) 3.74
313188 AI039702 Hs.179573 collagen; type I; alpha 2 3.73
314187 AA804409 Hs.118920 ESTs 3.73
311826 AA765470 Hs.122826 ESTs 3.7
302358 D81150 EST cluster (not in UniGene) with exon hit 3.68
311441 Z38720 Hs.151014 ESTs 3.66
321914 AA011603 EST cluster (not in UniGene) 3.59
332216 H95082 Hs.102332 EST 3.52
324771 AA631739 EST cluster (not in UniGene) 3.5
323691 AA317561 EST cluster (not in UniGene) 3.49
303525 AW516519 Hs.115130 ESTs 3.47
309709 AW242630 EST singleton (not in UniGene) with exon hit 3.46
300038 AFFX control: MurIL4 3.38
316526 AI088192 Hs.135474 ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens] 3.36
313029 AA731520 Hs.170504 ESTs 3.35
304356 AA196027 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 3.34
314610 AI948688 Hs.191805 ESTs 3.33
329815 CH.14_p2 gi|6624888 3.32
314949 AI745387 Hs.239124 ESTs 3.31
300598 N53574 Hs.158932 ESTs 3.3
329218 CH.X_hs gi|5868726 3.28
315706 AW440742 Hs.155556 ESTs 3.28
303751 AW503637 EST cluster (not in UniGene) with exon hit 3.25
307783 AI347274 EST singleton (not in UniGene) with exon hit 3.25
321414 AA324975 Hs.128993 ESTs; Weakly similar to KIAA0465 protein [H.sapiens] 3.25
312187 AA700439 Hs.188490 ESTs 3.25
334061 CH22_FGENES.327J4 3.23
336036 CH22_FGENES.678_7 3.23
321477 H67818 Hs.222059 ESTs 3.21
315760 AW139383 Hs.245437 ESTs 3.2
316733 AA811713 Hs.163222 ESTs 3.2
300855 AW235248 Hs.79828 ESTs 3.2
323611 AA304986 Hs.145704 ESTs 3.19
314138 AA740616 EST cluster (not in UniGene) 3.17
316774 AA814859 EST cluster (not in UniGene) 3.16
308884 AI833131 Hs.179100 ESTs 3.11
331317 AA258222 Hs.87757 ESTs 3.1
317221 AI989538 Hs.191074 ESTs 3.08
316386 AA749062 Hs.180285 ESTs 3.08
321040 H26953 EST cluster (not in UniGene) 3.08
308828 AI824829 EST singleton (not in UniGene) with exon hit 3.08
300778 AA236233 Hs.188716 ESTs 3.07
316667 AW015940 Hs.232234 ESTs 3.07
324614 AW5Q3101 EST cluster (not in UniGene) 3.07
316468 AW293046 Hs.255158 ESTs 3.07
300671 AI239706 Hs.189886 ESTs 3.06
314301 AW297967 Hs.188181 ESTs 3.05
312335 AW043620 Hs.236993 ESTs 3.03
322957 AA247755 EST cluster (not in UniGene) 3.01
316848 AA830053 Hs.126798 ESTs 3.01
313473 AA009660 Hs.251948 ESTs; Moderately similar to T07O3.7 [C.elegans] 2.99
318518 T27119 EST cluster (not in UniGene) 2.98
313383 AI076370 Hs.134037 ESTs 2.97
331389 AA458637 Hs.152207 ESTs 2.96
304257 AA053294 EST singleton (not in UniGene) with exon hit 2.95
309917 AW340014 EST singleton (not in UniGene) with exon hit 2.95
319661 H08035 Hs.21398 ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE
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321253 AI699484 
321193 AA149508 
332864 
300027 

M11507 
324330 AA8B4766 
320014 AA137114 



10 318885 Z43272 
AI04Q125 
AA233056 
AA825148 



318146 
323348 
305703 
335882 
317672 
323416 
312652 
324094 
319761 
317013 
317383 



312479 
332808 
311824 
321992 
316074 



312071 
312684 
332668 
322139 
304168 
325602 
319885 
300611 
316854 
318208 
331623 
324616 



AW205409 

A1610397 

AI419909 

AA382603 

R84237 

AA864468 

AA913887 

AW277121 

A1950844 

AW293826 

C06003 

AW517542 

AW296076 

AA683529 

AW294O20 

AA062971 

H53744 

H77679 

R59096 
N75450 
AA831215 
A1091458 
R38715 



40 

304968 AA614308 
314912 AI431345 
300767 AW193466 
313463 AI057369 
45 320600 AA135565 
301160 AI308989 
324825 AA704457 
300336 AW292417 
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317850 
339047 
324580 
321142 
319478 
300793 
313733 
326505 
314987 
303114 
318709 
312878 
329224 
328018 
323231 
312887 
315183 
300259 
313240 
316697 



N29974 

AA492588 

A1817933 

R06841 

A1248571 

AA836116 

AW015506 
AF090948 
H24244 
A1209108 



AA324437 

AW157377 

AW1 36134 

AI479011 

AI743261 

AW293174 



ISOMERASE [Rsapiens] 2.95 

EST cluster (not In UnlGene) 2.93 

Hs.103288 ESTs 2.93 

CH22_FGENES.28_4 2.92 

AFFX control: transferrin receptor 2.91 

EST cluster (not In UniGene) 2.88 

Hs.170291 ESTs 2.88 

CH22_FGENES.296_5 2.88 

EST cluster (not In UniGene) 2.87 

Hs.150521 ESTs 2.87 

Hs.191518 ESTs 2.85 

Hjl21229 F-boxproteinFbw1b 2.84 

CH22_FGENES.629_7 233 

Hs.127748 ESTs 2.82 

Hs.159560 ESTs 2.81 

Hs.160994 ESTs 2.81 

EST cluster (not in UniGene) 231 

EST cluster (not In UniGene) 2.8 

Hs.135646 ESTs 2.8 

Hs.126511 ESTs 2.78 

Hs.254881 ESTs ^ 2.78 

Hs/128738 ESTs; Weakly similar to non-lens beta gamma-crystallln like protein [Rsapiens] 2.77 

CH22J=GENES7_10 275 

H&250610 ESTs 275 

Hs.1 16456 ESTs 2.73 

Hs.208382 ESTs 2.73 

EST singleton (not in UniGene) with exon hit 2.73 

Hs.143119 ESTs 2.73 

Hs.1 17721 ESTs 2.72 
Hs.181 161 ESTs; Weakly similar to INHIBITOR OF APOPTOSIS PROTEIN 1 [M.musculus] 2.72 

EST cluster (not in UniGene) 2.72 

EST singleton (not in UniGene) with exon hit 272 

CH.13_hsgij5866994 271 

Hs.136698 ESTs 271 

EST cluster (not in UniGene) with exon tut 271 

Hs.159066 EST s; Weakly similar to predicted using Genefinder [Celegans] 2.69 

Hs.134559 ESTs 2.68 

Hs.1 53529 Homo sapiens done 24540 mRNA sequence 2.68 

Hs.162000 ESTs 2.68 

EST singleton (not in UniGene) with exon hit 2.67 

Hs.161784 ESTs 2.67 

Hs.136525 ESTs 2.67 

Hs.122536 ESTs 2.65 

Hs.250739 ESTs 2.65 

Hs.156939 ESTs 2.65 

Hs.255738 ESTs; Moderately similar to gag [Rsapiens] 2.65 
Hs.255074 ESTs; Moderately similar to high-risk human papilloma viruses E6 

oncoproteins targeted protein E6TP1 alpha [H.sapiens] 2.64 

EST cluster (not in UniGene) 2.64 

CH22_DA59H18.GENSCAN.287 2.64 

EST cluster (not in UniGene) 2.63 

HS209584 ESTs 2.62 

EST cluster (not in UniGene) 2.62 

Hs.186837 ESTs 2.61 

EST cluster (not in UniGene) 2.6 

CH.19Jisgi[5867435 2.6 

Hs.130730 ESTs 2.6 

EST cluster (not in UniGene) with exon hit 2.59 

Hs240763 ESTs; WeaWy similar to prediction 2.58 

Hs.143946 ESTs 2.57 

CHJLhsgi|5868728 2.56 

CH.06_hsgi|5902482 2.56 

Hs.177230 ESTs 255 

Hs.132910 ESTs 2.55 

Hs.220277 ESTs 2.55 

Hs.170783 ESTs 254 

Hs.131860 ESTS 254 

Hs*52627 ESTs 253 
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302697 



313966 AI807551 
331263 AA015718 

310683 AW055233 
AA085996 
AJ0014O8 
AI613519 
AFD86538 
AA974253 
10 323208 AA203415 
W76005 
AA243617 
AA256675 
AI624497 
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322347 
316240 



321643 
330723 
323455 
308383 
328744 
332344 
328121 
321915 
314954 
302821 
329454 
336605 
300664 
323362 
300024 
325026 
324510 
313389 
301309 
313570 
316504 
319401 
312827 
327871 
337173 
302948 
324303 
315527 
315979 
331310 
321095 



313035 
322114 
313671 
303211 
301256 
338165 
324692 
318587 
312378 
318625 
305181 
300815 
324063 
315859 
305092 
306598 
300307 
321348 
325112 
336679 
321383 
337357 
300680 
327120 
302761 
312132 



W45574 

AI670955 
AA521381 
AA188868 



AI444628

AL135067 

M10098 

AI671168 

AI148353

AI765182

M78276 

AA041455 

AW135854 

R01342 

AI744361 



AA465635 

AL118754

AI791138 

AA830515 

AA253351 

AA017595 

AI701559 

N36417 

AA643791 

W49823 

AA099548 

AA932948 

AA557952 

AA779704 

R41582 

T48446 

AA663726 

AA286678 

AW292740 

AA682305 

AA642912 

AI000320 

AI651016 

Z49979 

AI903770 

AJ002574 

AW468066 

AW250553 

AI475490

AA827652 



Hs.189061 ESTs 2.53

ze31a12.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:365743 3', mRNA sequence 2.51

Hs.160870 ESTs 2.5

Hs.248572 Human PAC clone DJ404F18 from Xq23 2.5

EST cluster (not in UniGene) with exon hit 2.5

EST singleton (not in UniGene) with exon hit 2.49

EST cluster (not in UniGene) 2.49

Hs.120319 ESTs 2.49

Hs.136200 ESTs 2.48

Hs.32094 ESTs 2.48

Hs.31082 ESTs; Highly similar to db83 [R.norvegicus] 2.48

Hs.200438 ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 2.47

EST singleton (not in UniGene) with exon hit 2.47

CH.07_hsgi|5868290 2.47

Hs^52497 ESTs 2.47

CH.06_hsgi|5868031 2.47

Hs.200151 ESTs 2.46

Hs.187726 ESTs 2.45

Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 2.45

CH.Y_hsgi|5868887 2.45

CH22_FGENES.420_4 2.45

Hs.256809 ESTs 2.44

Hs.117182 ESTs 2.44

AFFX control: 18S ribosomal RNA 2.44

Hs.12285 ESTs 2.43

Hs.120849 ESTs 2.43

Hs.119903 ESTs 2.43

Hs.255917 ESTs 2.43

Hs.209312 ESTs 2.43

Hs.132458 ESTs 2.42

EST cluster (not in UniGene) 2.42

Hs.205591 ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 2.42

CH.06_hsgi|5868131 2.41

CH22_FGENES.565-3 2.41

EST cluster (not in UniGene) with exon hit 2.41

EST cluster (not in UniGene) 2.4

Hs.116768 ESTs 2.4

Hs.222917 ESTs 2.4

Hs.44439 STAT induced STAT inhibitor^ 2.4

Hs.32844 ESTs 2.4

EST singleton (not in UniGene) with exon hit 2.39

Hs.144928 ESTs 2.37

Hs.191740 ESTs 2.37

Hs.145553 ESTs 2.37

Hs.191436 ESTs; Highly similar to dJ1118D24.4 [H.sapiens] 2.37

EST cluster (not in UniGene) with exon hit 2.36

CH22_EM:AC005500.GENSCAN.212-3 2.36

EST cluster (not in UniGene) 2.35

Hs.168830 ESTs 2.35

Hs.109219 retinal degeneration B beta 2.35

Hs.193162 ESTs 2.35

Hs.116922 EST 2.35

EST cluster (not in UniGene) with exon hit 2.34

Hs.254815 ESTs 2.34

Hs.133268 ESTs 2.33

EST singleton (not in UniGene) with exon hit 2.33

EST singleton (not in UniGene) with exon hit 2.33

Hs£46311 ESTs 2.33

EST cluster (not in UniGene) 2.33

Hs.124344 ESTs 2.32

CH22_FGENES.43-7 2.32

EST cluster (not in UniGene) 2.32

CH22_FGENES.73T>6 2.31

Hs.257712 ESTs; Weakly similar to KIAA0986 protein [H.sapiens] 2.31

CH.21_hsgi|6531970 2.31

EST cluster (not in UniGene) with exon hit 2.3

Hs.170577 ESTs 2.3

EST cluster (not in UniGene) 2.3
312189 T95594 Hs.187435 ESTs 2.3

306537 AA991705 EST singleton (not in UniGene) with exon hit 2.3

327061 CH.21_hsgi|6531965 2.3

315391 AA759098 Hs.1920G7 ESTs 2.3

322384 AI968646 Hs.33862 ESTs 2.29

323206 AA203339 Hs.220750 ESTs 2.29

318110 AI680915 Hs.201379 ESTs 2.28

335250 CH22_FGENES.516J 1 2.28

331696 Z38907 Hs.91662 KIAA0888 protein 2.28

318327 AW294013 Hs.200942 ESTs 2.28

324980 AA969121 Hs.254296 ESTs 2.28

319429 AI608881 Hs.11482 ESTs; Highly similar to Junctional adhesion molecule [H.sapiens] 2.28

310601 AI970543 Hs.192605 ESTs 2.28

318905 Z43395 EST cluster (not in UniGene) 2.28

323442 AA252753 Hs.164039 ESTs 2.27

304428 AA342250 Hs.99819 ubiquitin specific protease 16 2.27

313352 AW292127 Hs.144758 ESTs 2.27

316491 AA766025 Hs.238794 EST 2.27

317751 AI697668 Hs.202241 ESTs 2.26

314136 AA229781 Hs.221962 ESTs 2.26

306665 AI004614 Hs.130577 EST 2.26

303946 AW474196 Hs.221604 ESTs 2.25

313435 AA769123 EST cluster (not in UniGene) 2.25

317679 AA968799 Hs.150289 ESTs 2.25

322370 AA330095 EST cluster (not in UniGene) 2.25

306620 AI000929 EST singleton (not in UniGene) with exon hit 2.24

329109 CH.X_hsgi|5868626 2.24

311043 AI871209 Hs.177128 ESTs 2.24

300228 AM58372 Hs.158748 ESTs; Weakly similar to synapsin Ib [M.musculus] 2.24

307223 AI193698 Hs.184776 ribosomal protein L23a 2.24

309023 AI888045 EST singleton (not in UniGene) with exon hit 2.23

310749 AI493675 Hs.170332 ESTs 2.23

316769 AI914939 Hs.212184 ESTs 2.22

320409 AA356195 EST cluster (not in UniGene) 2.21

333149 CH22_FGENES.87_8 2.21

324951 M86125 Hs.137487 ESTs 2.21

321939 AI791617 Hs.145068 ESTs 2.2

320594 AI863952 Hs.169436 arginyltransferase 1 2.2

320722 R67430 Hs.172787 ESTs 2.2

321781 D78667 EST cluster (not in UniGene) 2.2

328903 CH.08_hsgi|5868514 2.2

303889 T19204 EST cluster (not in UniGene) with exon hit 2.2

325045 T08845 EST cluster (not in UniGene) 2.2

312828 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 2.19

335109 CH22_FGENES.494J5 2.18

330878 AA131471 Hs.71440 ESTs 2.18

311289 AI971362 Hs.231945 ESTs 2.18

304608 AA513456 EST singleton (not in UniGene) with exon hit 2.18

337393 CH22_FGENES.747-4 2.18

332812 CH22_FGENES.7J4 2.18

327665 CH.04_hsgi|5867839 2.18

314581 AW504859 Hs.237849 ESTs 2.17

326508 CH.19_hsgi|6682496 2.17

301242 AW161535 Hs.258803 ESTs 2.17

312780 AI765651 Hs.172900 ESTs 2.17

315954 AW276810 Hs.254859 ESTs 2.16

311179 AI880843 Hs.223333 ESTs 2.16

315320 AI084182 Hs.186895 ESTs 2.16

313017 AI015203 Hs.118015 ESTs 2.16

312430 AW139117 Hs.117494 ESTs 2.15

300864 AA406539 Hs.190958 ESTs 2.15

314753 AA463262 EST cluster (not in UniGene) 2.15

322574 AF156548 EST cluster (not in UniGene) 2.15

321409 C03864 EST cluster (not in UniGene) 2.15

321205 AA002047 EST cluster (not in UniGene) 2.14

320406 AA353895 Hs.152983 HUS1 (S. pombe) checkpoint homolog 2.14

337646 CH22_EM:AC000097.GENSCAN.11-2 2.13

303084 AF174008 EST cluster (not in UniGene) with exon hit 2.13

312185 AA654772 Hs.186564 ESTs 2.13
314465
318168 



320712 
318487 
317462 
304384 
314544 
319881 
328078 
317354 
308617 
311568 
313605 



AI066544 

AA602917 

AI821782

AI800041

R66867 

AI167877 

AW015206 

AA235482 

AA399018 

T72744 

AW090770 

AI738720 

AW439969 

AI761786 

AA848118 



AW296067 
AW149321 
AA640770 
AA347452 
AW450674 



AI052795 

AW503733 

AA670480 

AA693880 

AW445167 

AW408683 

AI678183 

AA120970 

R62925 

AA290875 

AJ215643 

W23285 

AA282197 

AA994530 

AI298794 

AI493742

AW294522 

AW245528 

AA137062

AI989942 

AI682303 
AA249018 



N27448 

AI274307 
AL134620 
R21945 
AA502583 

AW175841 
AW168096 

AI828174 
AI370434 



314569 AA813784 
332783 W45302 
315259 AA701499 



332933 
325498 
313659 
324596 
324783 
302696 
313418 
326920 
327574 
323207 
303753 
305235 
316055 
317194 
319565 
335146 
301475 
312442 
322502 
303693 
310179 
321121 
331330 
306557 
317865 
318667 
318042 
323818 
331286 
311262 
335601 
311351 
312996 
328190 



328227 
331481 
335288 
307513 
323316 
319479 
303482 
327489 
323935 
309575 
337043 
312897 
307881 



Hs.156974 
Hs.220587 
Hs.190555 

Hs.143716 
Hs.178784 
Hs.62954 
Hs.250835 



Hs.192271 

Hs.218177 
Hs.204674 
Hs.221216 



Hs.124106 
Hs.105411 



Hs.114696



Hs.192201 
Hs.170315 



Hs.126036 
Hs.32922 

Hs.170917 

Hs.143199 

Hs.243665 

Hs.30120 

Hs.171381 

Hs.89002 

Hs.129130 
Hs.165210
Hs.149991 
Hs.134754 
Hs.103853 
Hs.232150 

Hs.201274



Hs.43944 



Hs£56153 
Hs.197271 

Hs.192183 
Hs.195188 

Hs.227049



Hs.123001 
Hs.87889 
Hs.148115 



EST singleton (not in UniGene) with exon hit 
ESTs 

ESTs; Moderately similar to 
ESTs 

EST cluster (not in UniGene) 

ESTs 

ESTs 

ferritin; heavy polypeptide 1 
ESTs 

EST cluster (not in UniGene) 

CH.06_hsgi|5868008

ESTs 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

CH22_FGENES.38_7

CH.12_hsgi|5866967

ESTs 

ESTs 

EST cluster (not in UniGene) 

EST cluster (not in UniGene) with exon hit

ESTs 

CH.21_hsgi|6456782

CH.03_hsgi|5867818 

ESTs 

ESTs 

EST singleton (not in UniGene) with exon hit

EST cluster (not in UniGene) 

ESTs 

ESTs 

CH22_FGENES.499_2

prostaglandin E receptor 3 (subtype EP3) 

ESTs 

ESTs 

ESTs 

ESTs 

EST cluster (not in UniGene) 

ESTs; Highly similar to CGI-07 protein [H.sapiens] 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

CH2^FGENES^81_41 
ESTs 

EST cluster (not in UniGene)

CH.06_hsgi|5868077 

CH22_EM:AC005500.GENSCAN.148-16

CH22_FGENES.301_6

CH.06_hsgi|5868105 

EST 

CH22_FGENES.527J

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

ESTs 

ESTs 

CH.02_hsgi|6004459
ESTs 

glyceraldehyde-3-phosphate dehydrogenase 

CH22_FGENES.439-19

ESTs 

EST singleton (not in UniGene) with exon hit 

CH.07_hsgi|6004473

ESTs 

helicase-moi
ESTs 



2.13 
2.12 

!!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens]

2.11 



2.12 



2.11 

2.11 

2.11 

2.11 

2.1 

2.1 

2.1 

2.1 

2.09 

2.09

2.09

2.08

2.08 

2.08 

2.08 

2.08 

2.07 

2.07 

2.06 

2.06 

2.06

2.06 

2.05 

2.05 

2.05 

2.05 

2.05 

2.05

2.04 

2.04 

2.04 

2.04

2.03 

2.03 

2.03 

2.03

2.03 

2.02 

2.02 

2.02 

2.01 

2.01

2.01

2.01

2.01

2 

2 

2 

2 

2 

2 

2 

2 

2 

2 

1.99 
1.99 
1.99 
1.98 
1.98 
1.98 
1.98 
1.98 
1.98 
1.98 
313171
318060 
332256 
312110 
335864 
320389 
314065 
323086 
323919 
310750 
309435 
300129 
320130 
323787 
338112 
313625 
325240 



N67879 
AI241421 
N66393 
AI962180

W00545 

AA868267 

H15474 

AA862973 

AI373163 

AW090537 

AW028820 

AI820675 

AW373446 

AW468402 

AA412102 



300279 
326023 
321609 
324183 
336276 
334913 
325417 
318489 
318455 
306890 
315073 
321289 
308521 
306382 
331320 
324279 
309577 
327014 
303488 
306561 



313083 
327752 
318674 
301267 



H86021 
AA402453 



AW043590 

AI148763 

AI092235

AW452948 

R84687 

AI689808 

AA968967 

AA262999 

AA501412 

AW168753 

AW025860 
AA995223 
AA019806 
N50545 

AA295490 

AW297762 

AA608787 

AL036947 

AA317554 

AI765013 

AI246374

AA322155 

AW296132 

AA489697 

AW518573 

AA354549 

AW450967 
AW207642 
AI031771 

AA405696 



315278 AI985544 
325824 

316277 AA737780 
323181 AA418583 
301438 AA961643 
307050 AI147341 



321452 
311483 
300976 
323715 
313800 



304013 
322019 
334150 
310094 
316218 
324774 
326507 
314570 



Hs.157695 ESTs 
Hs.132236 ESTs
Hs.102754 ESTs 
Hs.226803 ESTs 

CH22_FGENES.629_9 
Hs.171785 ESTs 
Hs.85524 ESTs 

Hs.12214 Homo sapiens clone 23716 mRNA sequence 
Hs.220704 ESTs 
Hs.170333 ESTs 

EST singleton (not in UniGene) with exon hit
EST cluster (not in UniGene) with exon hit
ESTs 



1.97 
1.97 
1.97 
1.97 
1.97 
1.97 
1.96 
1.96 
1.96 
1.96 
1.96 
1.96 
1.95 



Hs.203804 

Hs.169885 ESTs; Weakly similar to cDNA EST EMBL:T02216 comes from this gene [C.elegans] 1.95



Hs.254020
Hs.250911 

AW237425 Hs.253817 



Hs.198800 
Hs.113011



Hs.225023



Hs^57631 
Hs.226306 



Hs.42788 
Hs.191688 



Hs.129559 
Hs.108447 
Hs.159200 



CH22_EM:AC005500.GENSCAN.185-24
ESTs

CH.10_hsgi|5866848
interleukin 13 receptor; alpha 1 

za21(9^1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293225 3', mRNA sequence

ESTs 

CH.17_hsgi|5867245

ESTs; Weakly similar to hMmTRA1b [H.sapiens]
ESTs

CH22_FGENES.762J5
CH22_FGENES.456J
CH.12_hsgi|5866925
ESTs 

EST cluster (not in UniGene)

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

EST singleton (not in UniGene) with exon hit 
EST singleton (not in UniGene) with exon hit 
ESTs 

ESTs; Weakly similar to Pro-Pol-dUTPase polyprotein [M.musculus]
EST singleton (not in UniGene) with exon hit 
CH.21_hsgi|5867664 
EST cluster (not in UniGene) with exon hit
EST 



Hs.255690 
Hs.112590



Hs.209128 
Hs.185861 

Hs.166674 
Hs.145053 
Hs.156110 
Hs.41181 

Hs.235240 
Hs.174021 
Hs.132586 



Hs.116429

Hs.213392 
Hs.143621 
Hs.127716 
Hs.146734 



ESTs 
CH.05_hsgi|5867949
EST cluster (not in UniGene)
ESTs
ESTs

EST cluster (not in UniGene)
EST cluster (not in UniGene)
ESTs
ESTs

EST cluster (not in UniGene)
ESTs 



Immunoglobulin kappa variable 1 D-8 

Homo sapiens mRNA; cDNA DKFZp727C191 (from clone DKFZp727C191)

CH22_FGENES.339J 

ESTs 

ESTs 

ESTs 

CH.19_hsgi|5867435
EST cluster (not in UniGene)
CH22_FGENES.758J
ESTs

CH.15_hsgi|5867048

ESTs 

ESTs 

ESTs 

EST 

EST singleton (not in UniGene) with exon hit
1.95
1.95 
1.95 
1.95 

1.95 

1.95 

1.95 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.93 

1.93 

1.93 

1.93 

1.93 

1.93 

1.93 

1.92 

1.92 

1.92 

1.92 

1.92 

1.91

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91

1.91

1.91

1.91

1.9

1.9

1.9

1.9

1.9

1.9

1.9

1.9

1.9

1.9

1.9

1.89 

1.89 

1.89 



302426 AL049925 Hs.225984 DKFZP547G0910 protein 1.89

320127 H72615 Hs.17268 ESTs 1.89

337736 CH22_EM:AC000097.GENSCAN.1(K>.2 1.89

331319 AA262755 Hs.194264 ESTs 1.88

310767 AI377505 Hs.158835 ESTs 1.88

314880 AI732169 Hs.105429 ESTs 1.88

312539 AI004377 Hs.200360 ESTs 1.88

309674 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.88

314621 AI627478 Hs.187670 ESTs 1.88

319495 AI972146 Hs.192756 ESTs 1.88

313472 AA007374 EST cluster (not in UniGene) 1.88

302705 U09060 EST cluster (not in UniGene) with exon hit 1.88

329511 CH.10_p2gi|3983514 1.88

317140 AI699412 Hs.201925 ESTs 1.87

302598 AI815985 Hs.129683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 1.87

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 1.87

332222 N28271 Hs.176618 ESTs 1.87

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.87

318470 AI159863 Hs.143713 ESTs 1.87

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 1.87

300370 AI827817 EST cluster (not in UniGene) with exon hit 1.86

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 1.86

325587 CH.iajisgil6682462 1.86

310237 AI884313 Hs.158906 ESTs 1.86

318872 R13085 EST cluster (not in UniGene) 1.86

303431 AA317915 EST cluster (not in UniGene) with exon hit 1.86

338427 CH22_EM:AC005500.GENSCAN.349-1 1.86

300452 AI352293 Hs.191098 ESTs 1.85

321279 H85330 Hs.146060 ESTs 1.85

301690 F05865 Hs.249180 ubiquitin-conjugating enzyme E2E 2 (homologous to yeast UBC4/5) 1.85

307932 AJ230822 EST singleton (not in UniGene) with exon hit 1.85

318292 AI679966 Hs.150603 ESTs 1.85

310254 AI239811 Hs.157491 ESTs 1.85

311790 AW016437 Hs.233462 ESTs 1.84

314248 AA278347 Hs.126078 ESTs 1.84

335586 CH22_FGENES.581_25 1.84

339209 CH22_FF113D11.GENSCAN.64 1.84

307954 AI419692 EST singleton (not in UniGene) with exon hit 1.84

302549 AF055136 Hs.248162 tectorin alpha 1.84

321629 H87213 Hs.158092 ESTs 1.84

301239 AA807558 EST cluster (not in UniGene) with exon hit 1.84

332434 N75542 Hs.75356 transcription factor 4 1.84

327192 CH.01_hsgi|5867445 1.83

310214 AI220072 Hs.165893 ESTs 1.83

320516 R33857 Hs.181479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 1.83

324231 W60827 EST cluster (not in UniGene) 1.83

336616 CH22_FGENES.613_5 1.83

328799 CH.07_hsgi|5868316 1.83

324661 AW504161 EST cluster (not in UniGene) 1.83

313190 AA766707 Hs.153039 ESTs 1.83

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 1.82

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 1.82

320187 T99949 EST cluster (not in UniGene) 1.82

320791 R78808 Hs.93961 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.82

305733 AA829535 Hs.84298 CD74 antigen (invariant polypept of MHC; class II antigen-associated) 1.82

308280 AI569349 Hs.180920 ribosomal protein S9 1.81

321533 W78877 Hs.40111 ESTs 1.81

312946 AI915122 Hs£04087 ESTs; Weakly similar to F33D11.9b [C.elegans] 1.81

319474 H90265 Hs.100636 ESTs 1.81

329519 CH.10_p2gi|3983510 1.81

324685 AA220982 EST cluster (not in UniGene) 1.81

320697 N62937 Hs.139181 ESTs 1.81

329246 CH.X_hsgi|5868732 1.81

332000 AA481271 Hs.193945 ESTs 1.81

310811 AW20990 Hs.161303 ESTs 1.81

325866 CH.16_hsgi|5867076 1.81

322064 Z78343 EST cluster (not in UniGene) 1.8

333712 CH22_FGENES.251J 1.8
313457 AA576052 Hs.193223 ESTs 1.8

321591 H85687 Hs.117927 ESTs 1.8

330260 CH.05_p2gi|6671884 1.8

311080 AI656320 Hs.197711 ESTs 1.8

329522 CH.10_p2gi|3983507 1.8

322889 AA081924 Hs.211417 ESTs 1.8

300175 AI275011 Hs.204877 ESTs 1.8

330976 H20560 Hs.244624 ESTs 1.8
300208 AI341180 Hs.196115 ESTs; Weakly similar to FIBRILLIN 1 PRECURSOR [H.sapiens] 1.79
319635 R17531 EST cluster (not in UniGene) 1.79
313454 AA730673 Hs.188634 ESTs 1.79
303093 AI400310 Hs.148958 ESTs 1.79
309815 AW292760 EST singleton (not in UniGene) with exon hit 1.79
326506 CH.19_hsgi|5867435 1.79
319845 AA649011 Hs.187902 ESTs 1.79
300290 AI623739 Hs.186387 ESTs 1.79
312180 AI248285 Hs.118348 ESTs 1.79
313058 D81015 Hs.125382 ESTs 1.79
330120 CH.19_p2gi|6671864 1.78
328412 CH.07_hsgi|5868405 1.78
302345 NMJW0565 EST cluster (not in UniGene) with exon hit 1.78
308100 AI475949 EST singleton (not in UniGene) with exon hit 1.78
311386 AW205705 Hs.207514 ESTs 1.78
330282 CH.05_p2gi|6671910 1.78
318856 Z43011 Hs.21169 ESTs 1.78
312486 AA845630 Hs.117904 ESTs 1.78
325450 CH.12_hsgi|5866941 1.78
321206 H54178 Hs.226469 ESTs 1.78

330977 H20826 Hs.31783 ESTs 1.78
303487 AA333666 EST cluster (not in UniGene) with exon hit 1.77
310398 AI264671 Hs.164166 ESTs 1.77
313230 AI540166 Hs.129563 ESTs 1.77
317747 AI683782 Hs.128245 ESTs 1.77
303381 AL038841 Hs.163313 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.77
336123 CH^FGENESJOI^ 1.77
300185 AI286182 Hs.208484 ESTs 1.77
316002 AW451733 Hs.119824 ESTs 1.77
319850 AA001811 Hs.83722 ESTs 1.77
329941 CH.16_p2gi|6165199 1.77
328329 CH.07_hsgi|5868375 1.77
322934 AI493054 Hs.158968 ESTs 1.77
325902 CH.16_hsgi|5867101 1.76
322239 W01813 Hs.12109 WD40 protein Ciao1 1.76
303530 AI274851 Hs.258744 ESTs 1.76
300980 AI025527 Hs.222097 ESTs 1.76
331909 AA437300 Hs.178210 ESTs 1.76
321553 H92449 Hs.116406 ESTs 1.76
301618 T52760 EST cluster (not in UniGene) with exon hit 1.76
319592 AA627356 Hs.163315 ESTs 1.76
318511 T26528 Hs.227175 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.76
327183 CH.01_hsgi|5867442 1.76
313516 AA029058 Hs.135145 ESTs 1.76
318644 AI752482 EST cluster (not in UniGene) 1.76
321632 AA419617 EST cluster (not in UniGene) 1.76
324657 AW451142 Hs.255628 ESTs 1.76
300437 AW449374 Hs.257149 ESTs 1.75
319775 AA504429 Hs.6211 methyl-CpG binding domain protein 1 1.75
314775 AI149880 Hs.188809 ESTs 1.75
337460 CH22_FGENES.780^ 1.75
309849 AW297444 EST singleton (not in UniGene) with exon hit 1.75
301471 AA995014 Hs.129544 ESTs; Weakly similar to ORF YLL027w [S.cerevisiae] 1.75
312739 AI318426 Hs.155925 ESTs 1.75
319995 H15355 Hs.60887 ESTs 1.75
326495 CH.19_hsgi|5867423 1.75
337497 CH22_FGENES.801-4 1.75
322633 AA004534 Hs.153981 ESTs 1.75
332177 F10812 Hs.101433 ESTs 1.75
326930 CH.21_hsgi|6456782 1.75
316893 AA837332 EST cluster (not in UniGene) 1.75
324826 AA704806 Hs.143842 ESTs 1.75

311269 AI656924 Hs.174257 ESTs 1.75

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75

314171 AI821895 Hs.193481 ESTs 1.75

311684 AI990741 Hs.252809 ESTs 1.75

334387 CH22_FGENES.380_1 1.75

312195 AI300101 Hs.252222 ESTs 1.75

315707 AI418055 Hs.161160 ESTs 1.74

324349 AW501470 EST cluster (not in UniGene) 1.74

300724 AI762929 Hs£06134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74

303714 AW501336 EST cluster (not in UniGene) with exon hit 1.74

318704 Z24981 EST cluster (not in UniGene) 1.74

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74

322601 W92924 EST cluster (not in UniGene) 1.74

319382 H93199 Hs.33665 ESTs 1.74

315858 AA737345 EST cluster (not in UniGene) 1.74

332243 N55484 Hs320540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74

324044 AL045752 Hs.211519 ESTs 1.73

320630 AA199847 EST cluster (not in UniGene) 1.73

327288 CH.01_hsgi|5867481 1.73

314986 AI201367 Hs.142860 ESTs 1.73

319078 H17255 Hs.144515 ESTs 1.73

326278 CH.17_hsgi|5867269 1.73

302552 H49792 EST cluster (not in UniGene) with exon hit 1.73

322322 AF086431 EST cluster (not in UniGene) 1.73

327075 CH.21_hsgi|6531965 1.73

317392 AI797588 Hs.145459 ESTs 1.73

300810 AI076890 Hs.186949 ESTs 1.73

315978 AA830893 Hs.119769 ESTs 1.73

323903 AA773580 Hs.193598 ESTs 1.73

330803 AA004699 Hs.150580 putative translation initiation factor 1.73

309845 AW296802 Hs.255580 EST 1.73

314963 AI689617 Hs.200934 ESTs 1.73

311710 F09774 Hs.175971 ESTs 1.73

315315 AI984592 Hs.15088 ESTs 1.73

300378 AA663560 Hs.235873 ESTs; Weakly similar to K11C4.2 [C.elegans] 1.73

316141 AW303457 EST cluster (not in UniGene) 1.72

319826 T71739 Hs.75442 albumin 1.72

312961 AI033922 Hs.122517 ESTs 1.72

334379 CH22_FGENES.379J1 1.72

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72

313031 N34927 Hs.186566 ESTs 1.72

329728 CH.14_p2gi|6065785 1.72

312090 N57692 Hs.118064 ESTs 1.72

323341 AL134875 Hs.192386 ESTs 1.72

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71

310766 AI971438 Hs.158824 ESTs 1.71

311450 AI809985 Hs^03340 ESTs 1.71

311792 AW238064 Hs.253909 ESTs 1.71

321500 H71999 EST cluster (not in UniGene) 1.71

311948 T78791 Hs^41569 ESTs; Moderately smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71

329089 CH.X_hsgi|5868614 1.71

322331 AF086467 EST cluster (not in UniGene) 1.71

318235 AI080361 Hs.134217 ESTs 1.71

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71

312681 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71

310250 AI478629 Hs.158465 ESTs 1.71

338178 CH22_EM:AC005500.GENSCAN.219-6 1.71

338910 CH22J5J32I10.GENSCAN.11-2 1.71

321225 AL080073 Hs.251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7

322289 AA534550 Hs.539 ribosomal protein S29 1.7

319802 AI701489 Hs.202501 ESTs 1.7

314022 AW452420 Hs.248678 ESTs 1.7

314937 AA515602 Hs.152330 ESTs 1.7

303344 AA255977 Hs.250646 



Hs.146315 
Hs.204096






300580 M761322 Hs.220538 ESTs 1.7

304398 AA262785 EST singleton (not in UniGene) with exon hit 1.7

313421 AW339515 Hs.163700 ESTs 1.7

309763 AW270182 EST singleton (not in UniGene) with exon hit 1.7

322092 AF085833 EST cluster (not in UniGene) 1.7

315603 AA764768 Hs.121158 ESTs 1.7

325031 T08597 EST cluster (not in UniGene) 1.7

327157 CH.01_hsgi|5866841 1.7

314809 AI741461 Hs.161904 ESTs 1.7

320361 H67220 Hs.146408 nitrilase 1 1.69

324721 AW402302 Hs.43616 ESTs 1.69

CH.07_hsgi|5868246 1.69

ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 1.69

CH.08_hsgi|6456775 1.69

ESTs 1.69

lipophilin B (uteroglobin family member); prostatein-like 1.68

EST cluster (not in UniGene) 1.68

EST singleton (not in UniGene) with exon hit 1.68

Hs.57697 hyaluronan synthase 1 1.68

Hs.125286 ESTs 1.68

CH.07_hsgi|5868485 1.68

EST cluster (not in UniGene) 1.68

Hs.137154 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971 1.68

Hs.110853 ESTs; Weakly similar to R10D12.12 [C.elegans] 1.68

Hs.157757 ESTs 1.68

Hs.242463 keratin 8 1.68

CH22_EM:AC005500.GENSCAN.39(MO 1.68

Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 1.68

EST cluster (not in UniGene) with exon hit 1.68

Hs.21062 ESTs 1.68

Hs.145946 ESTs 1.68

Hs.185980 ESTs 1.68

CH.14_hsgi|6381953 1.67

EST cluster (not in UniGene) 1.67

EST cluster (not in UniGene) 1.67

EST cluster (not in UniGene) 1.67

Hs.43149 ESTs 1.67

Hs.207407 ESTs 1.67

EST cluster (not in UniGene) 1.67

CH.08_p2gi|5932415 1.67

CH.X_hsgi|5868602 1.67

CH22_FGENES.318_3 1.67

Hs.128457 ESTs 1.67

EST cluster (not in UniGene) 1.66

Hs.17385 ESTs 1.66

CH.12_hsgi|5866941 1.66

Hs.232100 ESTs 1.66

CH.16_hsgi|5867160 1.66

EST singleton (not in UniGene) with exon hit 1.66

Hs.224630 ESTs 1.66

EST cluster (not in UniGene) 1.66
Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic nucleotide-gated channel 2 [H.sapiens] 1.66

EST cluster (not in UniGene) 1.66

CH22_FGENES.581_4 1.66

Hs.118112 ESTs 1.66

CH22_DA59H18.GENSCAN.3-1 1.65

CH J6_p2 Qi]6623963 1.65

ESTs 1.65

CH22_FGENES.395_9 1.65

AI064824 Hs.193385 ESTs 1.65

AW204480 Hs.253414 EST 1.65

309518 AW148928 Hs£48895 EST 1.65

307965 AI421641 EST singleton (not in UniGene) with exon hit 1.65

316787 AW369770 Hs.130351 ESTs 1.65

300835 AA401858 Hs.224843 ESTs 1.65

338763 CH22_EM:AC005500.GENSCAN.517-16 1.65

303327 AA232729 Hs.154302 ESTs 1.65

313231 AW139993 Hs.163682 ESTs 1.65



315702 AA657501 
302385 AJ224172 
319699 R14537 
309506 AW137700 
330417 D84424 
315296 AA876905 



AA354146 
AL079289 
AI927068
AI472124 
AI273815 

AA195405 

R05385 

Z42977 

AW244073 

AW137772 

AL080280 

T58960 

AA249037 

AA424754 

AI797592 

AA081820 



AI801500 
AF088106 
R73816 

AW452184 

AI185234 
AA524545 
W21298 
AI457946 



320303 
302967 
310695 
307512 
338506 
331722 
301431 
318853 
323032 
317538 
325780 
321739 
319808 
313443 
331366 
316443 
322878 
330320 
329081 
334026 
317791 
322235 
331148 
325452 
315106 
326014 
307130 
300943 
319402 
310889 



335568






323371 AL135118 
335568 
320654 AW263086 



330002 
315343 
334487 
312169 



AW205477 Hs.179891 
334073 CH22_FGENES.327J8 1.65

319901 T77138 Hs.8765 RNA helicase-related protein 1.65

326530 CH.19_hsgi|5867441 1.65

301126 AI802877 Hs310843 ESTs; Weakly similar to dJ1039K5.2 [H.sapiens] 1.65

314043 AA827082 EST cluster (not in UniGene) 1.65

304387 AA236027 EST singleton (not in UniGene) with exon hit 1.65

322932 AA099732 EST cluster (not in UniGene) 1.65

337272 CH22_FGENES.660-1 1.64

332694 AA262768 Hs.243901 KIAA1067 protein 1.64

318996 Z44266 EST cluster (not in UniGene) 1.64

315336 AW342028 Hs.256112 ESTs 1.64

313329 AW293704 Hs.122658 ESTs 1.64

318088 AW295409 Hs.137945 ESTs 1.64

313835 AI538438 Hs.159087 ESTs 1.64

320035 AA378974 Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64

309372 AW074330 EST singleton (not in UniGene) with exon hit 1.63

324157 AW402236 EST cluster (not in UniGene) 1.63

323929 AA354940 Hs.145958 ESTs 1.63

302490 AA885502 Hs.187032 ESTs 1.63

333942 CH22_FGENES.301_8 1.63

327469 CH.02_hsgi|5867772 1.63

301918 AA476777 EST cluster (not in UniGene) with exon hit 1.63

315664 AI744068 Hs.160712 ESTs 1.63

304405 AA282572 EST singleton (not in UniGene) with exon hit 1.63

310624 AI341594 Hs.157522 ESTs; Moderately similar to env protein [H.sapiens] 1.63

319250 F11623 EST cluster (not in UniGene) 1.63

310608 AI962234 Hs.196102 ESTs 1.63

317348 AI348076 Hs.831 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria) 1.63

306513 AA989230 EST singleton (not in UniGene) with exon hit 1.63

320807 AA086110 Hs.188536 Homo sapiens clone 24838 mRNA sequence 1.63

303710 AI269069 Hs.250852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 1.63

328291 CH.07_hsgi|5868363 1.63

304236 W93278 EST singleton (not in UniGene) with exon hit 1.63

317683 AI791700 Hs.127893 ESTs 1.63

311960 AW440133 Hs.189690 ESTs 1.62

312834 AI028309 Hs.114246 ESTs 1.62

325326 CH.11_hsgi|5866875 1.62

313663 AI953261 Hs.169813 ESTs 1.62

327526 CH.02_hsgi|6381882 1.62

300429 AW449679 Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens] 1.62

305169 AA663131 EST singleton (not in UniGene) with exon hit 1.62

316621 AI021996 Hs.122138 ESTs 1.62

329666 CH.14_p2gi|6272129 1.62

318035 AI744130 Hs.131201 ESTs 1.62

300492 AL031709 multiple UniGene matches 1.62

316532 AI307229 Hs.184304 ESTs 1.62

332048 AA496019 Hs.201591 ESTs 1.62

307113 AI183686 EST singleton (not in UniGene) with exon hit 1.62

319127 N49476 EST cluster (not in UniGene) 1.62

331155 R87650 Hs.33439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61

338220 CH22_EM:AC005500.GENSCAN.246-9 1.61

315763 AW515270 Hs.118342 ESTs 1.61

323571 AA984133 Hs.153260 c-Cbl-interacting protein 1.61

312240 R28628 Hs.203669 ESTs 1.61

304569 AA490934 EST singleton (not in UniGene) with exon hit 1.61

313179 AI076101 Hs.131704 ESTs 1.61

326858 CH.20_hsgi|6552462 1.61

317276 AI823847 Hs.129986 ESTs 1.61

312572 AA350125 Hs.187499 ESTs 1.61

311932 AW451654 Hs.257482 ESTs 1.61

302103 AA452310 Hs.26090 ESTs; Weakly similar to T20B12.1 [C.elegans] 1.61

308413 AI636253 Hs.196511 EST 1.61

310077 AI620617 Hs.148565 ESTs 1.61

337780 CH22_EM:AC000097.GENSCAN.121-2 1.61

327796 CH.05_hsgi|5867982 1.61

308352 AI610791 EST singleton (not in UniGene) with exon hit 1.61

324539 AI378032 Hs.125892 ESTs 1.61

303232 AA437414 EST cluster (not in UniGene) with exon hit 1.61

337884 CH22_EM:AC005500.GENSCAN.54-2 1.61
303620 AA397546 Hs.119151 ESTs 1.61

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61

314481 AA548589 Hs.105846 ESTs 1.61

300327 AI908894 Hs.245893 ESTs 1.6

323473 AA262442 EST cluster (not in UniGene) 1.6

326154 CH.17_hsgi|5867170 1.6

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6

323827 AW406878 EST cluster (not in UniGene) 1.6

322452 W56710 EST cluster (not in UniGene) 1.6

310597 AI739071 Hs.158515 ESTs 1.6

307871 AI368665 EST singleton (not in UniGene) with exon hit 1.6

322215 AF088005 EST cluster (not in UniGene) 1.6

318420 AI139857 Hs.143837 ESTs 1.6

332217 H98987 Hs.102383 EST 1.6

324937 M79230 Hs.192398 ESTs 1.6

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6

300674 AW467388 EST cluster (not in UniGene) with exon hit 1.6

315193 AI241331 Hs.131765 ESTs 1.6

319713 R24204 EST cluster (not in UniGene) 1.6

301210 AI379982 Hs.158944 ESTs 1.6

309365 AW072861 EST singleton (not in UniGene) with exon hit 1.6

321403 AW451454 Hs.247568 adenylate kinase 3 1.6

321908 AA376936 Hs.20998 ESTs 1.6

303349 AA382661 EST cluster (not in UniGene) with exon hit 1.6

324338 AL138357 Hs547514 ESTs 1.6

310599 AW300144 EST cluster (not in UniGene) 1.6

333193 CH22_FGENES.98J5 1.6

336433 CH22_FGENES.825J2 1.6

312097 AI352096 Hs.157169 ESTs 1.6

311445 AW204237 Hs.192703 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.59

317736 AI361722 Hs.192410 ESTs 1.59

308147 AI498991 EST singleton (not in UniGene) with exon hit 1.59

313489 AA017492 Hs.135655 ESTs 1.59

316289 AA902488 Hs.122952 ESTs 1.59

326983 CH.21_hsgi|5867657 1.59

314781 AW205298 Hs.202372 ESTs 1.59

328397 CH.07_hsgi|5868397 1.59

331970 AA461084 Hs.187677 ESTs 1.59

321744 N91419 Hs.12028 ESTs 1.59

310509 AI292181 Hs.150036 ESTs 1.59

315921 AI147545 Hs.114172 ESTs 1.59

322049 AI928242 Hs.144383 ESTs 1.59

301161 AA731518 EST cluster (not in UniGene) with exon hit 1.59

300548 AI026836 Hs.114689 ESTs 1.59

319142 F07366 EST cluster (not in UniGene) 1.59

313526 AW152263 Hs£49243 ESTs 1.59

305937 AA883238 EST singleton (not in UniGene) with exon hit 1.58

330123 CH.19_p2gi|6671869 1.58

327819 CH.05_hsgi|5867968 1.58

318250 AI476814 Hs.134603 ESTs 1.58

306760 AI034094 Hs.169476 tubulin; alpha; ubiquitous 1.58

322358 AA220235 Hs.246836 ESTs 1.58

317666 AI690269 Hs.201345 ESTs 1.58

320725 AA703319 Hs.120967 ESTs 1.58

311332 AW292247 Hs.255052 ESTs 1.58

334893 CH22_FGENES.452_7 1.58

318730 AA398215 EST cluster (not in UniGene) 1.58

315889 AW271639 Hs.221744 ESTs 1.58

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDA subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57

315086 AI492660 Hs.170935 ESTs 1.57

332514 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57

335549 CH22_FGENES.576J0 1.57

329532 CH.10_p2gi|3983505 1.57

323140 AA180467 EST cluster (not in UniGene) 1.57

313166 AI801098 Hs.151500 ESTs 1.57

337896 CH22_EM:AC005500.GENSCAN.56-3 1.57

330658 AA319514 Hs£11093 ESTs 1.57

324585 AI823969 Hs.132678 ESTs 1.57
317151 AW298195 Hs.255735 ESTs 1.57
308818 AI819700 Hs.208231 EST 1.57
326547 CH.19_hsgi|5867307 1.57
318833 H06234 Hs.24888 ESTs 1.57
320488 R31386 EST cluster (not in UniGene) 1.57
306929 AI124514 EST singleton (not in UniGene) with exon hit 1.57
338083 CH22_EM:AC005500.GENSCAN.174-1 1.57
316868 AI660898 Hs.195602 ESTs 1.57
310937 AI472880 Hs.170480 ESTs 1.57
328638 CH.07_hsgi|6004473 1.57
310074 AI651039 Hs.148559 ESTs 1.56
327058 CH.21_hsgi|6531965 1.56
320076 AI653733 HS504079 ESTs 1.56
322345 AF086529 EST cluster (not in UniGene) 1.56
314731 AI745498 Hs£04579 ESTs 1.56
318687 H49619 Hs.127301 ESTs 1.56
303841 AI934464 EST cluster (not in UniGene) with exon hit 1.56
302370 AJ009849 Hs.199297 Homo sapiens GNAS1 gene encoding NESP55 1.56
322571 AF156271 EST cluster (not in UniGene) 1.56
318050 AI052093 Hs.133132 ESTs 1.56
303388 AL039604 EST cluster (not in UniGene) with exon hit 1.56
323758 AA833858 EST cluster (not in UniGene) 1.56
328369 CH.07_hsgi|5868388 1.56
329415 CH.Y_hsgi|5868874 1.56
303915 AW468839 Hs£57767 EST 1.56
338794 CH22_EM:AC005500.GENSCAN.528-1 1.56
303074 AA243481 Hs.127320 ESTs; Weakly similar to KIAA0346 [H.sapiens] 1.56
318807 F08434 EST cluster (not in UniGene) 1.56
334287 CH22_FGENES.369_17 1.56
311928 AW024798 Hs^33374 ESTs 1.55
304592 AA505833 Hs.162017 EST 1.55
300785 AA682913 Hs.247179 ESTs; Weakly similar to KIAA0319 [H.sapiens] 1.55
304921 AA603092 EST singleton (not in UniGene) with exon hit 1.55
324605 AW502851 Hs.249978 ESTs 1.55
324473 AW501163 EST cluster (not in UniGene) 1.55
300566 H86709 Hs.21371 son of sevenless (Drosophila) homolog 1 1.55
314165 AA761265 Hs.221281 ESTs 1.55
302868 M157392 EST cluster (not in UniGene) with exon hit 1.55
314034 AI299137 Hs.154214 ESTs 1.55
325389 CH.12_hsgi|5866921 1.55
331849 AA417078 Hs.193767 ESTs 1.55
320536 AA331732 Hs.137224 ESTs 1.55
303347 AA258033 EST cluster (not in UniGene) with exon hit 1.55
315769 AA744875 Hs.189413 ESTs 1.55
317031 AA973297 Hs.126101 ESTs 1.55
300203 AI827065 H^224877 ESTs 1.55
304037 T26438 EST singleton (not in UniGene) with exon hit 1.55
322613 AW160507 EST cluster (not in UniGene) 1.54
317987 AW138174 Hs.130651 ESTs 1.54
322313 AF086386 EST cluster (not in UniGene) 1.54
323992 AW411383 Hs.169688 ESTs 1.54
325303 CH.11_hsgi|5866908 1.54
312701 AI457663 Hs.128127 ESTs 1.54
304787 AA582678 EST singleton (not in UniGene) with exon hit 1.54
305849 AA861571 EST singleton (not in UniGene) with exon hit 1.54
314557 AA401367 Hs.128647 ESTs 1.54
316507 AI381515 Hs.158381 ESTs 1.54
315023 AA533505 Hs.185844 ESTs 1.54
314920 AA513406 Hs.152307 ESTs 1.54
323097 Z44354 Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 1.54
325043 W27919 Hs.32944 inositol polyphosphates-phosphatase; type I; 107kD 1.54
307892 AI376086 Hs.158759 EST 1.54
324573 AA491600 Hs.161942 ESTs 1.54
313092 AI923673 Hs.212627 ESTs 1.54
324696 AA641092 Hs.257339 ESTs 1.54
303019 AF098363 EST cluster (not in UniGene) with exon hit 1.54
317158 AI459140 Hs.129109 ESTs 1.54
309536 AW151933 EST singleton (not in UniGene) with exon hit 1.54
301568 AI146423 Hs.146709 ESTs 1.53
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315674 AA651923 
321861 N79341 
310890 AI184510 



AA843868 
AA972712 
R51361 
AA663591 



316907 
312299 
331128 
305177 
337685 
335290 



307944 
300867 
335320 
329841 
317916 
332901 
305413 
316707 
313693 
316101 
320796 
307451 



AI858667 
AI418246 
AW340374 



AA724659 

AI016387 

AW469180 

AA922236 

AF038966 

A1248615 



331482 
318059 
325958 
315736 
314740 
314117 
301646 
338752 
309314 
301445 



N27515 
AI023175 

AA664265 
AW015667 
AA224368 
AA313954 

AW009312 
AI208364 



308501 A1685263 
312330 AA635305 
318040 A1018150 



AW189460 
AW407585 



325701 
315009 
303121 
309271 
328385 
307700 
314591 
304484 
304382 
304232 
309853 
312504 
313134 
330391 
314342 
305977 
301165 

300613 
324124 
308037 
323909 
315464 
306700 
337976 
306855 
311045 
315010 
310205 
310759 



AI318545 

AW103292 

AA432067 

AA232873 

W52674 

AW298169 

AW207346 

N63406 

AP015950 

AI873046 

AA887293 

N85769 

AI932294 

A1554212 

A1458207 

AL043148 

AW139500 

A1022056 

AI083982 

AI569399 

AA531082 

AW025248 

AW135924 



Hs.191850 

Hs.143728 

Hs.190567 
Hs.174818 
Hs.23423 



Hs.121033 



AI565071 Hs.159983 



Hs.184406 
Hs.170651 
HSL221037 
Hs.184543 

Hs.152060 

Hs.40296 

Hs.167022 

HS230213 
Hs.1 19427 
Hs.185164 



Hs.128233 

HS201150 
Hs.121574 
Hs.148781 



Hs.208358 
H&27769 



H&245328 
Hs.258373 



Hs57553 

Hs.143202 

Hs.258697 

Hs.115256 

Hs.258775 

HS224155 



EST cluster (not in UniGene)
ESTs
CH.17_p2gi|6042048
ESTs
ESTs
ESTs
EST singleton (not in UniGene) with exon hit
CH22_EM:AC000097.GENSCAN.77-1
CH22_FGENES.527_3
EST singleton (not in UniGene) with exon hit
EST singleton (not in UniGene) with exon hit
neural precursor cell expressed; developmental down-regulated 1
CH22_FGENES.534_7
CH.14_p2gi|6672062
ESTs
CH22_FGENES.36_2
EST singleton (not in UniGene) with exon hit
ESTs
ESTs
ESTs
secretory carrier membrane protein 1
EST singleton (not in UniGene) with exon hit
ESTs
ESTs
ESTs
CH.16_hsgi|5867142
ESTs
ESTs
ESTs
EST cluster (not in UniGene) with exon hit
CH22_EM:AC005500.GENSCAN.513-10
EST singleton (not in UniGene) with exon hit
ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens]
EST

ESTs
CH22_FGENES.719_10
CH.14_hsgi|5867028
ESTs
ESTs; Weakly similar to mCAC [M.musculus]
EST singleton (not in UniGene) with exon hit
CH.07_hsgi|5868395
EST singleton (not in UniGene) with exon hit
ESTs
ESTs
EST singleton (not in UniGene) with exon hit
EST singleton (not in UniGene) with exon hit
tousled-like kinase 2
ESTs
ESTs
telomerase reverse transcriptase
ESTs
EST singleton (not in UniGene) with exon hit
ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens]
ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens]



153 
1.53 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
153 
152 
152 
152 
152 
152 

152 
1,52 
152 
152 
152 
152 
152 
152 
152 
152 
152 
152 
152 
1.52 
152 
152 
152 
152 
152 
151 
151 

151 
151 



H&249604 

Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 1.51
H&174181 
Hs.186257 
Hs.1 16135 



Hs.174746 
H&240049 
Hs.202445 
H&224883 



ESTs 
ESTs 
ESTs 

EST singleton (not in UniGene) with exon hit 

CH22_EM:AC005500.GENSCAN.107-1 

EST singleton (not In UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

ESTs 



1.51
1.51
1.51
1.51
1.51
1.51
1.51
1.51
1.51
1.51
310954 AW449044 Hs.171298 ESTs 1.51
312019 T77046 Hs.188750 ESTs 1.51
334773 CH22_FGENES.430_5 1.51
332043 AA490831 Hs.125056 ESTs 1.51
322950 AA296219 EST cluster (not in UniGene) 1.51
337920 CH22_EM:AC005500.GENSCAN.67-3 1.51
328993 CH.09_hsgi|5868536 1.51
309245 AI972447 EST singleton (not in UniGene) with exon hit 1.51
312172 AI222168 Hs.191168 ESTs 1.51
304039 T47349 EST singleton (not in UniGene) with exon hit 1.5
301329 AI149653 Hs.190496 ESTs 1.5
313376 AI949246 Hs.200381 ESTs 1.5
324248 AW504918 EST cluster (not in UniGene) 1.5
308771 AI809301 EST singleton (not in UniGene) with exon hit 1.5
334935 CH22_FGENES.464_3 1.5
319764 AA019827 EST cluster (not in UniGene) 1.5
318519 T27135 EST cluster (not in UniGene) 1.5
332807 CH22_FGENES.7j9 1.5
322310 AF086376 EST cluster (not in UniGene) 1.5
324557 AA489166 Hs.156933 ESTs 1.5
332118 AA609585 Hs.162689 EST 1.5
319539 R09027 EST cluster (not in UniGene) 1.5
313149 AW291092 H&201O58 ESTs 1.5
329722 CH.14_p2gi|6065785 1.5
323514 AA861209 EST cluster (not in UniGene) 1.5
308078 AI472621 EST singleton (not in UniGene) with exon hit 1.5
337965 CH22_EM:AC005500.GENSCAN.100-10 1.5
335905 CH22_FGENES.635_13 1.5
TABLE 14A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 14. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
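
As an illustration of the three-column layout described above, the following is a minimal Python sketch that splits one Table 14A row into its Pkey, CAT number, and Accession fields. The names `ClusterRow` and `parse_cluster_row` are hypothetical, and the sketch assumes that a CAT number has the form digits, underscore, digits (as in "423880_2") and that rows without a CAT number list only accessions after the Pkey.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: a CAT number looks like "<digits>_<digits>", e.g. "423880_2".
CAT_PATTERN = re.compile(r"\d+_\d+")


@dataclass
class ClusterRow:
    pkey: int                    # Eos probeset identifier
    cat_number: Optional[str]    # gene cluster number; None when the row has none
    accessions: List[str] = field(default_factory=list)  # Genbank accessions


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-separated row into Pkey, CAT number, accessions."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty row")
    pkey = int(tokens[0])
    rest = tokens[1:]
    cat_number = None
    # The second token is treated as a CAT number only if it matches the pattern.
    if rest and CAT_PATTERN.fullmatch(rest[0]):
        cat_number = rest[0]
        rest = rest[1:]
    return ClusterRow(pkey=pkey, cat_number=cat_number, accessions=rest)


row = parse_cluster_row("316141 423880_2 AW303457 AA972713 AA724265")
single = parse_cluster_row("306513 AA989230")
```

Here `row` carries a cluster number and three accessions, while `single` represents a probeset listed with one accession and no cluster number.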



Pkey CAT number Accession 



322064 234514J 
321409 197898J 

322092 AS678J 
321452 212379.2 
313603 199797J 
320856 36098J 



322139 
321500 
313733 
322215 
399935 

321632 
313833 
322310 
322313 



322331 
322345 
322347 
322370 
321739 
321781 
314570 
300129 
322452 
321861 
323140 
322520 
321914 
322571 
322574 
314753 
300370 



.1 

552826J 

441212J 

470Q2J 

47070J 

286374J 

120893J 

47376J 

47386J 

47434.1 

47467J 

47537J 

47545J 

187612J 

43998J 

1511778 1 

280469J 

635249J 

497108,2 

1651920J 

159551J 

38916J 

85114J 

22297J 

SMZJ 

311451J 

3910J 



322601 577912J 
322613 34330J 



316055 409389 J 
323316 981458J 
300492 25768J 



BE261397 Z78343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA374087 AA584776 

N71838 AA282003T54072 AA761419 H92966 AI831371 AI095435 AI690247 R99331 AW964110AA975590AA346128 

H94196C03864 

AF085833 R69689AW341677AA923375BE327566AW63(M15R69601AW615339 
AW962489 H 64300 AA329527 
AA284333 AW4681 19 AA284334 AA810992 

AB040928 T94673 AI289313 A1536039 Z44366 BE141499 D601 16 D61488 D59945 AA419503 R28090 R72986 H03255 
AI1891 12 AI912312 AW51 1018 AI401349 AW470144 C14624 AI335797 240300 AI014456 D60269 0601 15 T16722 AI370673 
D60270 

H53744 AF075088 H53797 
BE004271 AI248023 AI022157 H71999 
AA766346 AA809877 AA8361 16 AW469598 AW977404 
AF088005 N51816 N51731 

AF086106 AI193589 AW665594 N71795 AA722627 AW665373 AI300251 

AW812795 AA419617 H87827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393 

AA766825 AA81 110) AA085906 AI762946 AW977820 

AF086376 W77804 W72689 AA837735 

AF086386 W77947 W72708 

AF086431 AA886756 AI557237 

AF086467W81444W81445 

W95298 AF086529 AI912190 AW294159 AI458747 W94782 

AF086538 W95969 AI631911 W95835 

AA330095 W25112 AA249401 

AL080280 T73124 H02689 AL030281 

D78667D78871 C18258 

AA904776 AA405696 AA405962 

AW028820AI219068 

AI1472G2W56755W56710 

N79341 N99082 N47551 

AA1 80467 AA4491 84 AA464831 AA505048 

T55958 T57205AF147346 

AA011603 N58604 N58611 

NMJM6102 AF156271 AA781868 AW152318 AW770403 AA909463 AA482996 AA758672 
AF156548 AA639797 AI675267 AI825497 AI823355 
AA463262 AA463615 AW160405 AW407583 

AW136181 AA581939 AK001221 AA694538 AA424043 AI016272 AA098960 AA884473 AI356180 BE391633 AA437086 
Ai277866 AA098827 AA992680 BE172624 AA424101 AA320776 AW962967 N77431 AW858960 AW858897 T85649 
AA357743 AI82781 7 AI905672 

AI082395 W92924 BE048524 AW005302 AI084474 AI369330 AI827710 AW135506 AW298694 

AW160507 NMJH3367 API 91338 AA384939 AI445790 AA730309 BE397003 BE267753 A1979163 N50386 AW583671 

AW583608 BE074466 BE074479 BE074471 AW976283 AA604393 AW162122 W73648 AI823475 N75898 W73713 

AW470099 AW513236 AWQ25055 AW6131 15 AI923379 W58081 AW664525 AW196795 AI143619 AI565152 AA025406 

AA505846 AI685494 AA829964 N59156 N59163 R15442 AA826919 AI610221 AI20012O AA603279 AW150822 A1189513 

AI807122 AI016368 AI335868 AW583389 AI193892 AI956157 AI628879 AW591589 AW583446 AI955406 AW148396 

A1340255 AI867942 AA748525 AA876991 Z38516 AI874002 AI869474 N63100 AA429094 AA082443 

AW105663 AA693880 AW517398AI768507 BE220851 AW978538 AA831489 

BE219300 BE327455 AL134620 R36741 R17996 

AU031709 AI249061 AA907658 AI420444 
308362 792518_1

307783 697809_1

301161 427238_1

324094 270098_1

309023 4737_1

316141 423880_2 AW303457 AA972713 AA724265

323371 117336_1 N45114 N51465 BE087338 AI083551 AL135118 BE395609

307700 30923J1 BE280998 BE254670 BE294951 BE564979 AW405364 AA069256 M129837 AI559667 BE281405 AW410850 BE041153 

A1254811 AW301340 A181 3335 AW301 411 AI609469 A161 1607 AI611 616 AI377623 AI335509 A181 3544 BE0431 65 A1371 663 
AI340452 AI612066 AW072890 AI254558 AI349884 AI370095 AI613383 AI61 1946 AI613353 AI307414 AI318229 AI612685 
AW305327 AW268924 AI370063 AI349292 BE049068 AI369098 AW274098 A1344845 AW075187 AI053401 Al 345220 
BE138515 AI613386 A1583302 AW301955 AI349661 AI307432 A1Q54168 AI223913 AI612081 A1348942 AI334539 A1309366 
AI370098 AI252360 AW086316 AW268911 AW073482 AI379802 AI224284 AI053661 A1334538 AI309369 AI309688 A1310023 
AI492709 AI335418 AI053999 AI366989 AW073478 AI247058 A1249584 AI305875 AI308585 AW071272 AI271487 AI340719 
AI366995 AI223673 AW271066 AI611936 AW071296 AI270796 AI254385 AI251393 AI252562 AW268236 AI254858 
AW071317 AI309102 AI609897 AW268971 AI583267 AI792484 AW075168 BE138443 AI254126 A1309822 AI310872 
AI61 1953 AI251054 AW276658 At335405 AW075039 AI31 1768 AI612028 AW271895 AI612005 A1312240 AW271082 
AI371642 AI334879 AI310194 AI310772 AI345419 A1334675 AI223914 AI284707 AI284813 AI349140 AI254853 AO13094 
AI310170 AI309499 AI312476 AI376484 AI335467 A134O802 AI309815 AI310168 AI61 1446 AI345824 BE327775 AI318545 
F17185AW614950 
AW998989AI613519 
AI347274AW844024 
AA731518AA765714 

BE395109 AW663898 AW237041 AI492154 BE046906 AI651285 AI983290 AW002590 A1201040 F32424 AA992272 
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MMUU/Or 4 MMUUf 400 MIOIOOOO 


Q01Q>!n A17R9 1 
0£I040 41/0x_l 


£4yy/y uoi/uo uouioo 


o 14 loo 1/000^1 


AA7iHTA1R AAfiSAASA AAOOQQO*? 
MM/ 4UO IO MA004OO4 Mn££03£O 


320712 57156_2 R66867 R65678 R82673 W73128 R83101
321383 41924_1 AW968556 AJ238555 AW968731 AJ002574 AA459446 H70260 AW977557 AA767351 AI300460 AA907450 AA649224 T07415 AI536896 BE018515 AI279865 BE047421
312996 187327_1 AW368634 AI702169 AI245179 AW368646 BE545574 AA249018 AW368633 N27553
306513 AA989230
306537 AA991705
306557 AA994530
306598 AI000320
306620 AI000929
306700 AI022056
308078 AI472621
306813 AI066544
306830 AI075803
306855 AI083982
329722 c14_p2
329728 c14_p2
306890 AI092235
308100 AI475949
308147 AI498991
306929 AI124514
308352 AI610791
308383 AI624497
308521 AI685808
308561 AI701559
308617 AI738720
308771 AI809301
308828 AI824829
308896 AI858667
303019 41850_1 AF098363 AF098365
303084 44211_1 AF174008 AF174027 AF174106
305092 AA642912
305169 AA663131
305177 AA663591
305235 AA670480
305413 AA724659
305849 AA861571
305854 AA862733


307113 


AJ183886 


307130 




OU3SO/ 


rVrVOOOCvX) 




AARA79Q3 
MrtOOf loo 




Ml£*tOD 1 □ 


OvrOlO 


Alt/ HOW/ 


ow two 


M1004IOO 


OUf Of 1 


Ml 000003 


OUf 00 I 


AlOf U404 


ouf 9iae 




OUf «Wf 




0U/bO4 


AMIQfiQ? 


our yoo 


AIA01RZ1 
A14£l04l 


309245 


AI07OAA7 


onoo'7'f 


A1QQR091 
Al900££l 


ouyooo 


AWUftODl 


QAQ07O 

OU90f£ 


AVVU/40OU 


oUa4oi) 


Awuyuoo/ 


ouyouo 


A\A/1Q77/V\ 


qrinroe 

ouyooo 


AWlOloOO 


309709 


AVYi:4<iOOU 


325417 C12J1S 




025450 Cl2_hS 




•vie icn _< o kn 

325452 Ci2JiS 




309815 


A\M*MVJ7CA 

AVVZyZ/DU 


nnnoon 

309839 


AWZ9o07o 


309849 




309906 


AVV0o9o4U 


302705 31765JI 


UQ90oO UUyubl 


304037 T26438
304039 T47349
W93278
304257 AA053294
304382 AA232873
304405 AA282572
304561 AA489792
304569 AA490934
304787 AA582678
304921 AA603092
327819 c_5_hs
304968 AA614308
306382 AA968967
331263 47479_1 AW780192 AA015718 W02571
332252 1663967_1 N63882 T91174
TABLE 14B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 14. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
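
To make the four columns concrete, here is a minimal Python sketch that turns one Table 14B row into a record with a normalized coordinate pair. The names `PredictedExon` and `parse_exon` are hypothetical. The sketch assumes that positions are written as two integers joined by a hyphen, that some Minus-strand rows list the larger coordinate first (for example "1390386-1390296"), and that both coordinates are inclusive when computing a length; none of these conventions is stated in the legend itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    pkey: int     # Eos probeset number
    ref: str      # sequence source: a GI number or a literature reference
    strand: str   # "Plus" or "Minus"
    start: int    # smaller coordinate
    end: int      # larger coordinate

    @property
    def length(self) -> int:
        # Assumption: both coordinates are inclusive.
        return self.end - self.start + 1


def parse_exon(pkey: int, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Build a record from one row, storing coordinates in ascending order."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    first, second = (int(part) for part in nt_position.split("-"))
    start, end = sorted((first, second))
    return PredictedExon(int(pkey), ref, strand, start, end)


plus_exon = parse_exon(326930, "6456782", "Plus", "606950-607705")
minus_exon = parse_exon(326920, "6456782", "Minus", "42425-42519")
# A position written with the larger coordinate first is normalized the same way.
descending = parse_exon(332864, "Dunham, I. et al.", "Minus", "1390386-1390296")
```

Storing the coordinates in ascending order and keeping the strand as a separate field lets rows written in either order be compared directly.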



Pkey Ref Strand Nt_position

332807 Dunham, I. et al. Plus 297686-297808
332808 Dunham, I. et al. Plus 298277-298360
332812 Dunham, I. et al. Plus 309688-310561
332901 Dunham, I. et al. Plus 1841954-1842090
333149 Dunham, I. et al. Plus 3574317-3574413
333916 Dunham, I. et al. Plus 8298994-8299169
334026 Dunham, I. et al. Plus 9196549-9196681
334061 Dunham, I. et al. Plus 9686941-9687077
334073 Dunham, I. et al. Plus 9792201-9792374
334150 Dunham, I. et al. Plus 10529221-10529854
334379 Dunham, I. et al. Plus 13908356-13908467
334719 Dunham, I. et al. Plus 15778859-15779026
334773 Dunham, I. et al. Plus 16235169-16235328
334893 Dunham, I. et al. Plus 19302753-19302881
334935 Dunham, I. et al. Plus 20108247-20108373
335146 Dunham, I. et al. Plus 21491292-21491457
335320 Dunham, I. et al. Plus 22542132-22542246
335568 Dunham, I. et al. Plus 24935021-24935655
335586 Dunham, I. et al. Plus 24990333-24990497
335601 Dunham, I. et al. Plus 25044923-25045157
336036 Dunham, I. et al. Plus 29019796-29019877
336123 Dunham, I. et al. Plus 30051089-30051186
336268 Dunham, I. et al. Plus 31997555-31998040
337173 Dunham, I. et al. Plus 23624127-23624224
337460 Dunham, I. et al. Plus 32536159-32536395
337685 Dunham, I. et al. Plus 3547161-3547245
337736 Dunham, I. et al. Plus 3850500-3850643
337780 Dunham, I. et al. Plus 4113793-4113990
337965 Dunham, I. et al. Plus 7034267-7034392
337976 Dunham, I. et al. Plus 7166011-7166119
338030 Dunham, I. et al. Plus 8072703-8072827
338112 Dunham, I. et al. Plus 10391398-10391600
338165 Dunham, I. et al. Plus 12205719-12205875
338178 Dunham, I. et al. Plus 12800037-12800181
338427 Dunham, I. et al. Plus 19685043-19685354
338506 Dunham, I. et al. Plus 21221871-21221953
338794 Dunham, I. et al. Plus 27114697-27114763
338910 Dunham, I. et al. Plus 28795375-28795551
339047 Dunham, I. et al. Plus 30760793-30760968
332864 Dunham, I. et al. Minus 1390386-1390296
332933 Dunham, I. et al. Minus 2035790-2035681
333193 Dunham, I. et al. Minus 3832993-3832494
333712 Dunham, I. et al. Minus 7286177-7286073
333940 Dunham, I. et al. Minus 85238304523671
333942 Dunham, I. et al. Minus 8552629-8552330
334287 Dunham, I. et al. Minus 13294116-13293871
334387 Dunham, I. et al. Minus 13946021-13945781
334487 Dunham, I. et al. Minus 14432191-14432132
334913 Dunham, I. et al. Minus 19463909-19463815
335109 Dunham, I. et al. Minus 21325792-21325667
335250 Dunham, I. et al. Minus 21952922-21952826
335268 Dunham, I. et al.
335290 Dunham, I. et al.
335549 Dunham, I. et al.
335862 Dunham, I. et al.
335864 Dunham, I. et al.
335905 Dunham, I. et al.
336205 Dunham, I. et al.
336276 Dunham, I. et al.
336433 Dunham, I. et al.
336605 Dunham, I. et al.
336616 Dunham, I. et al.
336679 Dunham, I. et al.
337043 Dunham, I. et al.
337272 Dunham, I. et al.
337357 Dunham, I. et al.
337393 Dunham, I. et al.
337497 Dunham, I. et al.
337646 Dunham, I. et al.
337920 Dunham, I. et al.
338083 Dunham, I. et al.
338220 Dunham, I. et al.
338752 Dunham, I. et al.
338763 Dunham, I. et al.
338983 Dunham, I. et al.
339209 Dunham, I. et al.
325240 5866848
329532 3983505
329522 3983507
329519 3983510
329511 3983514
325326 5866875
325303 5866908
325389 5866921
325417 5866925
325450 5866941
325452 5866941
325498 5866967
325587 6682462
325602 5866994
325701 5867028
325780 6381953
329722 6065785
329728 6065785
329666 6272129
329815 6624888
329841 6672062
325824 5867048
325866 5867076
325902 5867101
325958 5867142
326014 5867160
329941 6165199
330002 6623963
326154 5867170
326023 5867245
326278 5867269
330036 6042048
326547 5867307
326495 5867423
326507 5867435
326505 5867435
326506 5867435
326530 5867441
326508 6682496
330120 6671864
330123 6671869
326858 6552462
326983 5867657
327014 5867664



Minus 


22304275-22303770 


Mimic 


22309950-22 VWfLQI 


Minus 


24fififi203-24fififi 1 Pfl 

C*HJUU<-\AJ £H\JU>J ICQ 


Minus 


2fiRQM0O-?fifiQ01 9*5 


Minus 


aCOOSWOO f "£uO«WOO£ 


Minus 


9fiQfiRP.Rft-9flQftP.71 Q 


Minus 


OWf /*100-OU l l/#OI 1 


Mimic 

Minus 


**9nQA39fl.^9fVM1 R 1 


Mimic 

Minus 


0*Haj/ D^htOwJ/ ICO 


Minus 


IOO 1 DOLTj- IOQ IOOOO 


Mimic 

Minus 


9A0.91 fi97.9fin9flftAfl 


Minus 


OfWC7G fLO/Va CCD 1 


Minus 


1 7A(Y7'iW. 1 7A(V70t: 1 
1 /HU/OOUM / 4U/£0 1 


Minus 


9P.9il1A7ft.9P.9il1 9JY7 
CfScfk I**/ 0*202410U/ 


Minus 


ouauo i /y-ouyuo iuy 


Minus 


^1 A71 7A7.Q1 A71 ECO 


Minus 


Q W71 9.1 7.99971 9Kfl 
000/ 101 f-0o0/120O 


Minus 


Oft/4 PRGQ 9P/lfl£99 

2o4obby*2o4oDo2 


Minus 


oUol o4o-oUo 10 1 U 


Minus 


G91 O/l Oft_OH D9A1 

yoi o4oo-yjiooui 


Minus 


1410o44U- 141 001 Q4 


Minus 


26421 0/4*2642 1 1 00 


Minus 


2oo2o 1 4o-ZooZo009 


Minus 


2y9ubbbo-2y90o702 


Minus 


oo/toortca oo vino coo 
o^4yZ9oo*oZ492o9o 


Minus 


oZoUl-o2ooO 


Plus 


42yoT"4oUl4 


Minus 


OCOCC OCX CO 

ooZoo-oo4oo 


rlUS 


loWf-iooa/ 


Plus 


OrtQCC 0-400C 

20900-2 lo2o 


PIUS 


4//2o-4oU24 


Minus 


70CCC TOCOA 

7ooob-/OOoU 


PIUS 


2o^o/2-2oy7oy 


Minus 


l lUooo-llU/40 


ii; nltr 

Minus 


yf0CO^7Q_yl0CCCO 

4ooo/y-4o5oo2 


Minus 


/U41UO-/U42UZ 


PIUS 


4 TfOOTO 1TOAOA 


Plus 


•lOfSTO/4 •iOeftCT 

12o724m2o9o7 


Dine 

PIUS 


70100 7G9C1 

/yi22-/yzoi 


Minus 


729oo-7oU4o 


Oh in 

PIUS 


000O4-OO0/0 


Minus 


1 127 13m 1 2992 


Minus 


2U/044-2U7741 


PIUS 


9ojU7-9o44o 


Minus 


DO431*00/2U 


Minus 


4UlOl"4UoJl 


Minus 


424oU-42ooo 


Mini to 

Minus 


OA9£fi_04fi9P 
94000^94020 


Minus 


1 0770Q.1 97ftil0 
12//2y- 12/042 


Ditto 

PIUS 


0O4o/-OO00U 


M in tie 

Minus 


1U500MU44/ 


Minus 


9A91Q.9AA11 
O40iy*04411 


Dine 
HUS 


4ouy/-4oioo 


Minus 


71rt9-717Q 

/ iuo-/ 1 /y 


Pine 
rlUS 


171700*1 71RQR 
I / 1/99*1/ 1090 


Plus 


7C0R0.7Kqrvi 
/ sj£OU"foaiMj 


PInc 


117120-117216 


Minus 


623677-623870 


Pius 


11843-11930 


Minus 


13038-13111 


Minus 


8818*8949 


Minus 


9368-9509 


Minus 


303000-303122 


Plus 


78904-79112 


Minus 


127553-127656 


Minus 


35311-35406 


Minus 


69337-69670 


Minus 


16023-16581 


Plus 


1017630-1017788 
326930 6456782 Plus 606950-607705
326920 6456782 Minus 42425-42519
327058 6531965 Plus 2384268-2384835
327061 6531965 Minus 3486389-3486673
327075 6531965 Plus 4041318-4041431
327120 6531970 Minus 6-1088
330126 6093735 Plus 82458-82623
327157 5866841 Minus 4408-4746
327183 5867442 Plus 84317-84531
327192 5867445 Minus 194652-194764
327288 5867481 Plus 48583-48773
327469 5867772 Plus 145549-145708
327489 6004459 Minus 57796-58015
327526 6381882 Minus 97010-97123
327574 5867818 Plus 68767-69126
327665 5867839 Plus 141736-141900
327752 5867949 Plus 9372V94421
327819 5867968 Minus 92202-92717
327708 5867982 Plus 85267-85405


330280 

OOvcuv 


8871884 


Pius 


4520S4S269 


330282 


8671010 


Plus 


3982-4114 


328078 

0£QUf O 


5888008 


Plus 


72807-72865 


328121 


5888031 


Plus 


153782-153850 

1 %J\Jt IMm IWMMV 


O<_0 IvJU 


5888077 


Pius 


21082-21165 


328227 


5888105 


Minus 

ft III IUO 


21082-21242 


007071 


5888131 
9000 10 1 


IV ill (US 


88889-89221 


328018 


590248? 


Minus 

If III IUJ 


542547-543133 




5868246 


Minus 

rv 111 iug 


120666-120836 


328744 


5868290 


Plus 


138639-138722 


000700 


5888316 


Mini tQ 


80771-80923 


OOCQQ1 


5868363 

OOOOOOO 


Minus 

mil iuo 


144244-144434 


oopopq 


5868375 
30000/ a 


Pius 


191709-192239 


39836Q 
O&OOOs 


5R68388 
0000000 


Plus 


75371-75583 


0*10000 


cocoon c 

0000030 


Plus 


369952-370155 


3283Q7 


5868307 

OOOOO Of 


Plus 


344967.345063 




0000*HJO 


Plus 


86427-86519 


qoQcoo 


5R6R4P5 


Plus 

r JU3 


3814-4243 


32ftfi56 


vWHH f O 


Plus 


792616-792729 




6004473 


Plus 


294618-294903 


328903 5868514 Plus 23625-24468
328960 6456775 Plus 38547-38837
330320 5932415 Minus 54458-54697
328993 5868536 Plus 49160-50084
329081 5868602 Plus 93368-93510
329089 5868614 Plus 25805-26923
329109 5868626 Plus 102168-102273
329192 5868716 Plus 166936-167020
329218 5868726 Minus 71408-71707
329224 5868728 Plus 27422-27664
329246 5868732 Minus 250541-250792
329415 5868874 Plus 1011438-1011818
329454 5868887 Plus 51342-51593
TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16

Table 15 depicts UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is
linked by EosCode to Table 16.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

UnigeneTitle: Unigene gene title

EosCode: Internal Eos name

Localization: Predicted cellular localization of gene product
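
The link between Table 15 and Table 16 through EosCode can be pictured as a keyed lookup. The following Python sketch is illustrative only: `Table15Row` and `link_by_eoscode` are hypothetical names, the two example rows reflect one reading of the flattened columns of Table 15, and the Table 16 entry is a placeholder string rather than a real sequence.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Table15Row:
    pkey: int
    ex_accn: str
    unigene_id: str
    unigene_title: str
    eos_code: str
    localization: str


def link_by_eoscode(
    rows: Iterable[Table15Row], table16: Dict[str, str]
) -> Tuple[Dict[str, Tuple[Table15Row, str]], List[str]]:
    """Pair each Table 15 row with the Table 16 entry sharing its EosCode.

    Returns the linked pairs and the EosCodes that have no Table 16 entry.
    """
    linked: Dict[str, Tuple[Table15Row, str]] = {}
    missing: List[str] = []
    for row in rows:
        sequence = table16.get(row.eos_code)
        if sequence is None:
            missing.append(row.eos_code)
        else:
            linked[row.eos_code] = (row, sequence)
    return linked, missing


rows = [
    Table15Row(100452, "D87742", "Hs.241552", "KIAA0268 protein", "PAB7", "not determined"),
    Table15Row(101249, "L33881", "Hs.1904", "protein kinase C, iota", "OAA1", "cytoplasmic"),
]
# Placeholder standing in for a Table 16 sequence entry (not real data).
table16 = {"PAB7": "<placeholder sequence>"}
linked, missing = link_by_eoscode(rows, table16)
```

Reporting the unmatched codes separately makes it easy to spot rows whose EosCode was lost or altered during extraction.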



Pkey ExAccn UnigeneID UnigeneTitle EosCode Localization



100394 
100452 
101249 
101485 
101514 
101851 
102398 
102522 
102669 
103119 
103709 
104080 
104144 
104691 
105370 
106149 
106579 
107102 
107217 
108153 
109014 
109112 
109890 
110151 
112971 
113021 
114908 
114965 
116393 
116416 
117698 
117984 
118985 
119018 
119126 
120992 
121710 
121913 
122041 
122593 
123209 
124526 
126399 
126645 



127537 
128790 
129109 
129184 
129389 



084276 

D87742 

L33881 

M24736 

M28214 

M94250 

U42359 

U53347 

U71207 

X63629 

AA037316 

AA402971 

AA447439 

AA011176 

AA236476 

AA424881 

AA456135 

AA609723 

051095 

AA054237 

M156790 

AA169379 

K04649 

H18836 

T17185 

T2385& 

AA236545 

AA250737 

AA599463 

AA609219 

N41002 

N51919 

N94303 

N95796 

R45175 

AA398246 

AM19011 

AA428062 

AA431407 

AA453310 

AA489711 

N62096 

AA128075 

A1167942 

R38438 

AA569531 

AA291725 

AA491295 

W26769 

AA621604 



Hs.66052 

Hs.241552 

Hs.1904 

Hs.123072 
Hs.82045 

Hs.183556 

H&29279 

Hs5877 

Hs.13804 

HSS7771 

Hs.183390 

Hs.37744 

HJL22791 

Hs£56301 

Hs.23023 

Hs.30652 

Hs.40808 

Hs£62Q36 

Hs^57924 

HS50843 

Hs.31608 



Ks.129836 

Hs£4973 

Hs.72472 

Hs.39982 

Hs.45107 

Hs.106778 

Hs.55028 

Hs^78695 

Hs.117183 

Hs.97594 



Hs.98732 
Hs.128749 
Hs.203270 
Hs.293185 

Hs.61635 

Hs.182575 

Hs.162859 

Hs.105700 

Hs.108708 

Hs.109201 



CD38 antigen (p45) PBC1 plasma membrane

KIAA0268 protein PAB7 not determined

protein kinase C, iota OAA1 cytoplasmic

selectin E (endothelial adhesion molecul ACC5 plasma membrane
RAB3B, member RAS oncogene family PFJ2
midkine (neurite growth-promoting factor LBH9
gb:Human N33 protein form 1 (N33) gene, P0G3
solute carrier family 1 (neutral amino a PFJ4
eyes absent (Drosophila) homolog 2 LEM9
cadherin 3, type 1, P-cadherin (placenta L8G2
hypothetical protein dJ462023.2 P006
kallikrein 11 PBA6
hypothetical protein FLJ13590 POM3
Homo sapiens beta-1 adrenergic receptor PAV1
transmembrane protein with EGF-like and PDM9
hypothetical protein MGC13170 PD08
ESTs PAA4
KIAA1344 protein PAA3
DKFZP586E1621 protein PDG8
ESTs PBF1
ESTs, Weakly similar to Z223_HUMAN ZINC
hypothetical protein FLJ13782 BCU4
Homo sapiens cDNA FLJ11245 fis, clone PL
hypothetical protein FLJ20041 PAV9
transmembrane, prostate androgen induced
KIAA1028 protein PD03
cadherin-like protein VR20 PFJ6
ESTs BCY2
hypothetical protein MGC2648 PDV3
ESTs OAB6
ESTs PDT9
ATPase, Ca++ transporting, type 2C, memb
ESTs, Weakly similar to I54374 gene NF2 PDM8
Homo sapiens prostein mRNA, complete cds
ESTs PBF8
KIAA1210 protein PDG5
prostate androgen-regulated transcript 1 PDV5
ESTs; protease inhibitor 15 (PI15) BCU7
Homo sapiens Chromosome 16 BAC clone CIT
alpha-methylacyl-CoA racemase P0O1
ESTs, Weakly similar to ALU1_HUMAN ALU S
ESTs, Weakly similar to JC7328 amino ad PAV4
transmembrane, prostate androgen induced
six transmembrane epithelial antigen of PAA5
solute carrier family 15 (H+/peptide tra PD05
ESTs PAA6 not determined

secreted frizzled-related protein 4 BCX2 secreted
calcium/calmodulin-dependent protein kin PFJ7
CGI-86 protein PAV6 vesicular

spondin 2, extracellular matrix protein CJA5 not determined



secreted 

plasma membrane 
cytoplasmic 
plasma membrane 

secreted 

plasma membrane 
plasma membrane 

plasma membrane 
not determined 

plasma membrane 
PDG7 

not determined 
PDG4 

plasma membrane 
CHA1 not determined 

plasma membrane 
mitochondrial 



ER 

PAJ5 not determined 
PAB2 plasma membrane



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 

plasma membrane 



129404 
129534 
130760 
131425 
132964 
132967 
133179 
133330 
133520 
133724 
133724 
133944 
134110 
301805 
302005 
302881 
303506 



AA172056 

R73640 Hs.11260 

AA128997 Hs.18953 

AA219134 Hs.26691 
AA031360 



ESTs 

hypothetical protein FU11264 



PAB4 
PAJ3 



AA032221 

U81599 

U42360 

X74331 

U07919 

U07919 



Hs.61635 
Hs.66731 
Hs.71119 
Hs.74519 
Hs.75746
Hs.75746 



303753 
308050 
310382 
310431 
310573 
310598 
310816 
311596 
313676 
314121 
314691 
314785 
314907 
315051 
315052 
316442 
317548 
317869 
318524 
319191 
319763 
320324 
320561 
320796 
321441 

322782 
322818 
323226 
323287 
324295 
324430 
324603 
324617 
324626 
324658 
324718 
330211 
330546 
330762 
330790 
330892 
331099 
331490 
331889 
332247 
332396 
332697 
332798 
334447 



AA045870 Hs.7760 

U41060 Hs.79136

AI800004 Hs.142846

AI669666 Hs.123119

AA508353 Hs.105314

AA340605 Hs.105887 

030891 Hs.19525 

AW503733 Hs.9414 

AI460004 Hs.31608

AI734009 Hs.127699 

AI420227 Hs.149358 

AW292180 Hs.156142 

AJ338013 Hs.140546 



AI973G51 
AI682088 



Hs.24965
Hs.79375 



AA861697 Hs.120591 
AI732100 Hs.187619 
AW207206 Hs.136319 
AI538226 Hs.32976 
AI672225 Hs.222886
AW292425

AA876910 Hs.134427
AA760894 Hs.153023
AI654187 Hs.195704
AW295184 Hs.129142
AW291511 Hs.159066
AF071538
AA460775 Hs.6295
AF071202 Hs.139336
NM_006953 Hs.159330
AF038966 HsU31218
AW297633 Hs.118498
W07459 Hs.157601
AA056060 Hs.202577
AW043782 Hs.293616
AF055019 Hs.21906
AA639902 Hs.104215
AI146686 Hs.143691
AA464018 Hs.184598
AW016378 Hs.292934
AA508552 Hs.195839
AI685464

AI694767 Hs.129179
AI557019 Hs.116467



U31382 

AA449677 

T48536 

AA149579 

R36671 

N32912 

AA431407 

N58172 

AA340504 

T94885 



Hs.299867

Hs.15251 

Hs.122764 

Hs.91202 

Hs.14846 

Hs.291039

Hs.98802 



ESTs PBA7 
ESTs PAA7 
six transmembrane epithelial antigen of PM17 
homeobox B13 PFJ5
Putative prostate cancer tumor suppresso PDM1 
primase, polypeptide 2A (58kD) PDM2 
aldehyde dehydrogenase 1 family, member 
aldehyde dehydrogenase 1 family, member 
Homo sapiens mRNA; cDNA DKFZp564A072 (fr 
UV-1 protein, estrogen regulated BCR4 
hypothetical protein PEU4 
MAD (mothers against decapentaplegic, Dr PBJ6
reteudn 1 (HI) PBH3
ESTs, Weakly similar to Homolog of rat Z PEG4
hypothetical protein FLJ22794 PBM4
KIAA1458 protein PBY3
hypothetical protein FLJ20041 PEU5
KIAA1603 protein PCQ8
ESTs, Weakly similar to A46010 X-linked PBH1
ESTs PEN3 
ESTs PCW3 
ESTs PET5 
holocarboxylase synthetase (biotin-[prop PBH8 
ESTs PBY2 
ESTs PBY1 
ESTs BFF8 
guanine nucleotide binding protein 4 CB07 
ESTs, Weakly similar to TRHY_HUMAN TRICH
ESTs PBM9 
ESTs PBJ7 
ESTs PBJ9 
ESTs PBQ6 
deoxyribonuclease II beta PBQ7
hypothetical protein FLJ10188 PBJ1
prostate epithelium-specific Ets transc PEN1
ESTs, Weakly similar to T17248 hypotheti PE07
ATP-binding cassette, sub-family C (CFTR PBH5
uroplakin 3 PEL9
secretory carrier membrane protein 1 PBY4 
Homo sapiens LUCA-15 protein mRNA, splic
ESTs CBF9
Homo sapiens cDNA FLJ12166 fis, clone MA
ESTs PCQ7
Homo sapiens clone 24670 mRNA sequence
ESTs, Moderately similar to SPCN_HUMAN S
ESTs PBQ9
Homo sapiens cDNA: FLJ23241 fis, clone C
ESTs PBM3
ESTs, Weakly similar to 138022 hypotheti PBH4
Qb1t88f04j(1 NCLCGAP_Pi28 Homo sapiens
Homo sapiens cDNA FLJ13581 fis, clone PL
small nuclear protein PRAC CBK1 

PBJ2 

guanine nucleotide binding protein 4 PEW4 
hypothetical protein PBM1 
TMPRSS2, transmembrane protease, serine 
ESTs PBQ4 
Homo sapiens mRNA; cDNA DKFZp564D016 (fr
ESTs PCI4
ESTs, Moderately similar to T14342 NSD1 PBH7
gb:za21f09.s1 Soares fetal liver spleen PBQ5
gb:hw31a09.x1 NCI_CGAP_Kid11 Homo sapien

PBQ8 
. PBH2 
PBY9 
PBY7 



nuclear 

plasma membrane 
plasma membrane 
nuclear 

plasma membrane 

PDT1 mitochondrial 
PDT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic 
secreted 

not determined 
not determined 
plasma membrane 

plasma membrane



not determined 
cytoplasmic 
PBM2 not determined

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 



PBQ1 not determined 
plasma membrane 
PCI2 not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
PCW6

PBJ4 plasma membrane 

nuclear 

not determined 

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic

nuclear 

not determined 
nuclear 

PBJ8 not determined 

secreted 

nuclear 

not determined 

not determined 
401424 PFG2
407122 H20276 Hs.31742 ESTs PEW7
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence
409262 AK000631 Hs.52256 hypothetical protein FLJ20624 PFG1
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo PEW3
411098 U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9
413125 BE244589 Hs.75207 glyoxalase I PFJ3
413623 AA825721 Hs.246973 ESTs OBH6
414422 AA147224 Hs.337232 Homeobox A13 PFC6
415263 AA948033 Hs.130853 ESTs PEZ5
417153 X57010 Hs.81343 collagen, type II, alpha 1 (primary ost PFJ1
418601 AA279490 Hs.86368 calmegin PFA1
418848 AI820961 Hs.193465 ESTs PEY4
418882 NMJXM996 Hs.89433 ATP-binding cassette, sub-family C (CFTR OBH2
419839 U24577 Hs.93304 phospholipase A2, group VII (platelet-a PFH9
421887 AW161450 Hs.109201 CGI-86 protein PFH2
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second ty PFH5
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II PFH6
425710 AF030880 solute carrier family, member 4 PFD4
427958 AA418000 Hs.98280 potassium intermediate/small conductance PFH1
428819 AL135623 Hs.193914 KIAA0575 gene product PFD6
429900 AA460421 Hs.30875 ESTs PEZ7
429918 AW873986 Hs.119383 ESTs PEY5
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface PEZ4
431217 NM_013427 Hs.250830 Rho GTPase activating protein 6 PFG6
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1
431992 NM_002742 Hs.2891 protein kinase C, mu PFH4
432189 AA527941 gb:nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens
432244 AI669973 Hs.200574 ESTs PEW8
432437 W07088 Hs.293885 ESTs PFG3
432966 AA650114 Hs.325198 ESTs PEY3
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEW5
440260 AI972867 Hs.7130 copine IV PEW6
440901 AA909358 Hs.128612 ESTs PFC8
445424 AB028945 cortactin SH3 domain-binding protein PEZ6
446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, m
447210 AF035269 phosphatidylserine-specific phospholipas PFH8
449156 AF103907 Hs.171353 prostate cancer antigen 3, non-coding DD PEZ8
449625 NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2
449650 AF055575 Hs.23838 calcium channel, voltage-dependent L ty PFD2
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJ8
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f
452039 AI922988 ESTs PFD8
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma PFG4
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5
452946 X95425 Hs.31092 EphA5 PFH3



mitochondrial 

plasma membrane 

PEY1 

nuclear 

nuclear 

mitochondrial 

cytoplasmic 



ER 

secreted 
plasma m 
cytoplasmic 

secreted 

plasma membrane 
plasma membrane 
nuclear 



plasma membrane 
nuclear 

cytoplasmic 
PFA2 



PFH7 



plasma membrane 
plasma membrane 

PFG9 plasma membrane

nuclear 
cytoplasmic 
plasma membrane 
TABLE 15A shows the accession numbers for those primekeys lacking a UniGene ID in Table
15. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
GenBank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The GenBank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: GenBank accession numbers



Pkey CAT number Accession 



132964 
129389 



116393 131543J AI972402 A1634409 AI523716 AJ789749 W44518 A1424438 AI688513A1971048AI686324 AW013854AA588483 AA528111 AI627428 
A1582200 AI669296 AI826926 AI620526 AI669958 A1972458 AI924500 AA512903 W44517 AA335363 AW238997 BE300165 
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003820 AW009463 AA669796 AA114966 AI653342 AA1 15038 
AI342150 AI092100 AI968211 W51994AI804005 AI201420 AI123210 AI738405 AI674964 AI970341 AW02750OAI493316AI333193 
AI139353 AA599463AI656163A1804200 AI365321 AI990213 AI657011 AA650025 AI968810 AI341978 AA599839 AW592602 
AA644289 AI468578 AI565265 A1565228 BE221535 AW973052 
101485 181 13 J AA296520 AL021940 M3064O NMJXJ0450 M24736 M61894 AL047443 H39560 AI694691 AA916787 AI214796 AA939085 AI150616 

AA412553 AA412545 AI051015 T276S4 AA694430 
126399 17331J AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 AI972096 AW071693 AI742327 AI377498 AI804815 AI640802 

AI885001 AI921394 AAS9S1 15 N71820 AI921217 AW007283 AI467828 AI369306 AA917446 AI493698 AA088701 AA126899 AI936228 
AW204238 AI039567 AI925027 BE138909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 AI521341 
AW273569 AW050740 AA536113 AA559064 AI474392 AW1 35709 AA535181 AW572959 AA570597 A1905464 AI677810 AI587642 
AW975102 AA424310AA482527N64192AA658276AW889117AA486591 AW889172 AI381990 AI381991 AI673419 AI990950 
AA487031 AI272934 AI150565 AA229168 AW316722 AI142707 BE222396 AA6141 68 AA122G26 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 A1250993 BE146418 AA122025 
94346J AI362575 AI805082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464 

21074.1 NM.012445 AB027466 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250 
AW007762 A1341557 AI799666 AI972710 AI377966 AI962810 AI084783 AI458032 AI190971 AW148913 AA372354 AW970032 
AW007426 AA650188 AI123203 A1122890 A1280975 W73595 W73495 AI863238 AA374109 AA603986 AW149089 AW957523 
AI307748 AI921067 AI336463 F24537 AI380460 AI367500 AI189309 AI814701 AI766921 AW572106 AA037024 AW072576 AA578293 
A1288103 AA235464 AW450642 AA574230 AW294024 A1589229 AI560733 AW512227 AA877009 AI66Q255 AW188597 AA558228 
AI572782 AA658397 AI274628 AI866359 AA864573 AI264439 AA621604 AW515493 AW243333 Z39737 AI567038 AA573997 
AA573559 AW236431 A1652870 A! 634973 AA034505 AA047126 
156454 J A1267700 AI720344 AA191424 AI023543 A1469633 AA172056 AW958465 AA1 72236 AW953397 AA355086 
8836J AL080235 AA031750 081382 AI480231 AI095947 AI560953 BE010721 AI870290 AA374945 AA125792 D51527 D51556 AI685541 
D51559 AW1 17286 AA195741 AI675138 AW593439 AI201885 T30590 AW952100 D51095 AA523864 W70043 AA987586 AI421515 
AI205532 AA127069 AI337367 D51595 AI453785 AW075677 AW088359 C14287 C14284 

AF163474 NMJM6590 AF163475 AI761105 AI770098 AA410580 AA411616 AI590343 AI739Q50 AL0501 98 AI862645 AA419104 
AA513809 AA333032 AI816915 AW139625 AA640889 AI311391 AI627693 AW135514 AA41901 1 AI269149 AI245259 AI970008 
AJ970017 AW139445 AA569503 AI761072 AI766179 AI759995 A1300776 AI870129 AW150770 AA226501 AA226220 . 
AI249368 AI742316 AA428062 AA442089 AI864189 BE349478 A1603475 AI584049 BE552085 AI088609 AI2641 97 AI886144 AI129474 
AJ307145 BE181 300 AW058403 A1696838 AW748598 AA442196 AI216428 
entrez_U42359U42359 

347217J AW292425 BE467167 AI702953 BE550961 BE222309 AI299348 AI693336 AA541708 
33641 1J AI685464 AW971336 AA513587 AA525142 

16065J NMJJ12391 AF071538AB031549 AI685592At745526AA662204AW130657AA662164 AW971121 AI668916 AA513274 AI991223 
A1979170 AW298436 AA639821 A185901O AW513942 AI687669 AA662521 AA548598 AI345056 AI305374 BE043418 AI432856 
AI334840 AI379796 AI492693 AI30791 5 BE042082 A1307834 AI307858 AI309488 BE042210 AI435670 AI371605 AI862491 A1284563 
AI306872 AI255044 AI254601 AI251236 AI473073 AI473042 AI432760 AI435664 AI336826 AI289365 A1369096 AI862274 AI334871 
AI349863 AI250405 AI377617 A1309895 AI313017 AI862291 AI31 1936 AI378718 AI305722 AI306769 AI308888 AI334565 A1862296 
AI344230 AI435685 AI344087 AI378696 AI31 1209 AI435775 AI310611 AI311154AI432289 AI431561 AI492681 AI432867 AI335288 
AI492796 A1432769 AI310299 AI432273 AI379820 AI275319 AI435753 AI609441 AI432767 AI369100 AI31 1420 AI349974 AI247157 
AJ334677 A1270910 AI224320 AI305608 AI334489 AI377152 AI350012 AI370086 A1335053 AI306761 AI306750 AI334849 AI334874 
AI340380 AI307876 AJ305974 AI30S972 AI311521 AI334872 AI862509 AI31 1498 AI335051 A1289684 AI310859 AI31 1862 AI862483 
A1492775 AI307906 AI492708 AI269693 AI340373 AI307910 AI31 1359 AI435653 AI334865 AI31 1492 AI492809 AI492690 AM31576 
AI862268 A1311679 AI308435 AI492792 AI862512 AI275321 AI431568 AI431564 AI307885 AI307926 AI435692 AI435778 AI310182 
AI308894 AI492707 AI492713 AI308560 AI307829 A1343234 AI580598 AW472796 AI340918 AI310243 AI309368 AI307920 AI289665 ( 
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121913 291015J 
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AI306777 AW086318 AW086292 AW086378 AI310027 A1275293 AI389082 AI340900 AI306749 AI371558 AW086287 BE043803 
AI306793 AI306272 AI287948 AI270917 AI284816 A1336813 A1284546 AI308044 AI275290 AI270872 AI306795 AI289687 AI223570 
A1305303 A1289677 A1287742 A1275284 A1306812 AI336701 AI371554 A1378719 AI344988 AI223631 AI335141 AI343222 A1284568 
AI305357 AI275270 A1345932 AI436549 AI307925 AI311502 A1344238 AI343182 AI308508 AI305988 AI270790 AI379792 AI305647 
AI305410 AI432251 AI436517 A1343227 AI3Q5534 A1340387 AI271043 AI305499 A1271046 AI3G5962 AI289465 AI305378 A1289725 
AI310848 AI305848 A1289362 A1252964 A1307049 AI310831 AI306993 AI306796 AI224659 AI305969 AI349855 AI306164 Al 306948 
AI284676 A1309155 AI343202 AI432785 AI306815 AI369081 AI270885 AI289699 A1435704 AI309647 AI305716 AI311281 A1287927 
AW72995 AI340423 AI270958AI307069 AI305364 AI270807AI275306AI311890AI275263 AI432750 AI289371 AM32S61 A1255113 
AI305709 A1473008 AI311168 AI309711 AI377164 A1271201 A1289560 AI309710 AI306195 A1311201 A1267741 AI271066 A1432876 
AI275281 AI379795 AI472972 AI31 1967 AI306826 AI305465 AI270792 AI473019 AI305340 AI270922 AI305995 AI305462 AI254144 
A1270969 AI473012 A1305390 AI275278 A1223644 A1289692 A1250318 AI305372 AI289691 AI250521 AI306283 AI306814 AI307933 
AI4731 60 AJ432903 AI223720 A1254979 AI334862 AI306926 AI289541 AI432248 A1435722 AI435698 AJ432859 A1310683 AI4731 75 
AI335144 A1289467 AI436489 Al 306928 AI473033 A1305763 AI307868 AI307882 A1348959 AI435736 AI432857 AI432896 A1435735 
AI432283 AI473086 A1432863 AI473081 AI432825 AI307840 A1473164 AI432885 A1473166 AI472982 AI435734 A1473060 AI473171 
AI432279 AI432882 AI334670 AI4365 12 AI432827 AI432852 AI473051 AI473077 A1435697 AI271509 A1492781 AI472983 A1473018 
AI432897 AJ473043 AI432871 AI436536 AI473157 A(349715A1432777 AI473016 AI473158 AI340369 A1307941 AJ432773 AI377146 
AI492791 A1270950 A1305342 AI284604 A1306269 AI28481 1 A1270811 AI289347 AI334869 AI334852 AI311759 A1250382 AI309520 
A1289550 At305721 A1340870 AI2709 01 AI308575 AI307904 AI340715 AI270941 AI309808 AI246867 AI473014 AI307039 AI289360 
AI473069 A1492786 A1344013 AI305876 A1436510 AI340742 A1473028 AI307891 BE041871 BE041268 BE042340 BE041 946 
BE041763 A1306173 A1201948 A1926972 AI275769 

338265 CH22_6856FG_UNK_EM ACO0 

330211 Cjjfi 

332798 CH22J4FG_6_5_UNK_C4G1.G 
334447 CH22J746FGL387_7JJN1CEM 

332247 372969J AA669097 AA51 3815 AA026798 AA676526 AA704429 AA704269 AW1 18292 AA57921 6 N58172 

332396 20265J AW579842 BE156562 BE156690 BE156489 BE081033 AKQ01559 BE149402 M85387 AW367811 AW367798 R17370 AI908947 

AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 R53463 H1 1063 AW068542 Z40761 BE176212 BE176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI0781 61 
BE463983 AI805213 AI761264 W94885 N94502 AI623772 AI419532 A1810302 AI634190 AW0Q2516 AW150777 AI352312 AI367474 
AW204807 AI675502 AI337026 AW134715 BE328451 AI123157 AI560020 AI300745 AI608631 AI248873 AA742484 AW051635 
H18646 AI245045 AA507111 AI64051 0 AI925594 AA1 15747 AA1 43035 AA151 106 

332697 13699.1 X51405 NMJB1873 T1 1322 AL1 18886 BE328175 AW136009 BE467445 AW470313 AA774852 BE504139 AW501046 AA082792 
AW389231 AA370044 R36841 AA371457 C04813 R25791 R25556 AW895854 AW903819 AW895871 AW895677 BE159723 
AW895664 AW895597 AW895595 AW895665 AW888518 AI903724 F06081 F08503 AL1 19462 AW895730 AW888516 R2651 1 
R26489 AA334126 AA327626 N85713 AW895998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566 AA330159 AJ922855 AA383512 AA029603 D82246 D821 71 T94933 H56545 AA348060 
AA176888 R96764 AW451817 AA385766 AA452618 AI690057 AA988822 BE549928 AA150S01 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 A1422070 A1361256 A168Q224 D57122 T94885 
R53266 R46713T19071 AW796277 AA325333 F04719 F02334 AA358146 AA626597 AA358304 AW028099 AL1 19570 D57290 
D58273 057796 N46555 AI361969 AAS23457 D57225 AW024046 AA992606 AW0221 16 AW021538 AA935845 H69870 K56546 
AW961219 AA453239 AW837541 N45521 BE218Q29 AA318877 AA327740 AW961609 T92139 D53216 D52365 D53363 053312 
D53116 AI547267 AA679935 AW026552 AW026418 AW1 90507 AI927710 AW244108 D50948 AW054991 AW021063 AW022511 
AA493436 AI365636 BE464761 AW149384 AA102442 AW771368 AI818251 AI126368 D51049 AI421542 AI559467 AW079779 
AW021048AW023969AW(M4214AI458264AA027274AI620254AW02^^^ AA326242 N67561 A1971273 AA878328 

D57131 AA770662 A1309299 A1796767 AA613338 W58076 AI566287 A1445573 AJ 880260 AA00191 9 AW339259 A1492610 A149261 1 
R97692 Al 301 425 AA722603 058361 AI350323 AA973926 AI431263 AA5161 26 AA865467 A1925177 N39443 AA001943 A1299371 
AI082412 AA665090 AA583433 H89871 AA977231 AJ 3522 19 AI056096 A1270446 N67524 N22103 AW614224 AA744054 AW243622 
AI613188 AI929173 AI350243 AI362138 AA744004 AA176661 056787 AI955625 AI393109 A1094769 A1479728 AI423107 AI955617 
AI034036 AI582196 AW264534AI418961 AA570761 A!343538 AA650341 AA992503 AA770004 AL039666 AI862675 AW1 90335 
AA610274 AW418627 BE467472 056786 T28749 AI217610 A1359556 T23523 AL040189 AA846222 AA651636 D51280 AI888986 
AI521 167 AI340177 AW612815 A1625285 AA621 607 AA177059 AA229768 AA829788 AI749682 AW190631 N75299 AA230089 
A1915632 BE069542 AA890020 AA528397 AA995390 BE503860 AA570812 AW339396 AM 97986 AI203725 A1282379 AA670375 
AA461 513 F01728 AW243599 C00856 N75567 R95995 AA150932A95961 AA648060 AA933800 AA927073 AA101 126 AA864190 
T93566BE167472 

425710 25529 1 AF030880 NM.000441 AC002467 AA385554 H23053 AW891836 Al 139968 AA653057 A1695233 
432189 342819.1 AA527941 AI810608 A1620190AA635266 

445424 6391J AB028945 T77648 F13328 AL157605 Z46212 AA304736 F1 1855 T66098 T30174 AW954164 AW176301 AW748243 AA456428 
AI369958 AA938565 AW959613 242008 AA994779 AI683909 F1 1019 F10926 AI769597 AI752550 T65015 AJ884314 AA643954 
Z41838 AW020147 AI038822 AW571822 AA299781 AA894928 AF131790 BE00541 1 AI902476 AW082695 AA464384 R42750 
AW902301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

447210 7119J AF035269 AF035268NM_015900T96213U37591 AA156832 AA299371 AI084325H95977AI765967BE221465 AA156726 AI969563 
AW024539 AI436791 AI949451 AA843093 AI452756 AA824232 AJ306667 T96131 AW207447 AW243556 AW957032 AI084332 
H95978 U30998 

449625 8113J NM.014253 AF100772 BE088769 AL022718 BE161779 AW863569 BE161640 AL039060BE168542 AW296554 AA323193 AA235370 
AW779760 N48674 AI375997 R45432 059344 AI2031O7 F07491 R35360 R25094 AI913631 A1498402 T61382 AI016320 N45526 
T61415AA331486 

452039 89513J AI922988H05475AA021608 AW169947AA913750Z41614 AW800012 
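
Each row of Table 15A pairs a Pkey with a CAT number and a run of accession numbers, and in this copy some accessions are fused together without spaces. The following is a minimal Python sketch of how such a row might be split. It assumes that accessions follow the one-letter-plus-five-digit or two-letter-plus-six-digit GenBank formats, or the NM_ RefSeq format with six digits. The name `parse_cluster_row` and the regular expression are illustrative assumptions and are not part of the original disclosure; accessions damaged in reproduction are not repaired.

```python
import re

# Assumed accession shapes: RefSeq mRNA (NM_ + 6 digits), two letters + 6 digits,
# or one letter + 5 digits. Anything that does not fit these shapes is skipped.
ACCESSION = re.compile(r"NM_\d{6}|[A-Z]{2}\d{6}|[A-Z]\d{5}")


def parse_cluster_row(row):
    """Split a Table 15A row into Pkey, CAT number, and a list of accessions."""
    pkey, cat_number, rest = row.split(maxsplit=2)
    return {
        "pkey": pkey,
        "cat": cat_number,
        "accessions": ACCESSION.findall(rest),
    }


# Row copied as it appears in the table above, including the fused accessions.
row = "452039 89513J AI922988H05475AA021608 AW169947AA913750Z41614 AW800012"
parsed = parse_cluster_row(row)
print(parsed["pkey"], parsed["cat"], parsed["accessions"])
```

The alternation tries the longer two-letter form before the one-letter form, so a fused run such as `AI922988H05475` is split at the accession boundary rather than in the middle of an accession.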



TABLE 15B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 15. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are GenBank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey     Ref                  Strand   Nt_position

334447   Dunham, I. et al.    Plus     14308764-14308824
332798   Dunham, I. et al.    Minus    232147-231974
338255   Dunham, I. et al.    Minus    15242294-15242231
330211   6013592              Plus     59158-59215
401424   8176894              Plus     24223-24428
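
The strand and nucleotide-position columns of Table 15B together determine the length of each predicted exon. The sketch below shows one way to compute those lengths in Python. It assumes that the coordinates are inclusive and that Minus-strand exons are written with the higher coordinate first, as the rows above suggest; the function name `exon_length` is hypothetical and is not taken from the original disclosure.

```python
def exon_length(strand, nt_position):
    """Return the exon length in nucleotides for an inclusive coordinate pair."""
    first, second = (int(part) for part in nt_position.split("-"))
    # Assumption: Minus-strand exons list the higher coordinate first,
    # so the pair is swapped before the length is computed.
    if strand == "Minus":
        first, second = second, first
    if first > second:
        raise ValueError("coordinates do not match the stated strand")
    return second - first + 1


# Rows transcribed from Table 15B: (Pkey, Strand, Nt_position).
rows = [
    ("334447", "Plus", "14308764-14308824"),
    ("332798", "Minus", "232147-231974"),
    ("338255", "Minus", "15242294-15242231"),
    ("330211", "Plus", "59158-59215"),
    ("401424", "Plus", "24223-24428"),
]
lengths = {pkey: exon_length(strand, pos) for pkey, strand, pos in rows}
print(lengths)
```

Under these assumptions the predicted exons range from 58 to 206 nucleotides in length.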
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TABLE 1 1 AND SEQUENCE LISTING 

SEQ ID NO:1 BCU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_024915

Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons)
1          11         21         31         41         51

|          |          |          |          |          |
ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 

ATGCCCAGTG ACCCTCCATT CAATACCCGA AGAGCCTACA CCAGTGAGGA TGAAGCCTGG 120 

AAGTCATACT TGGAGAATCC CCTGACAGCA GCCACCAAGG CCATGATGAT CATTAATGGT 180 

GATGAGGACA GTGCTGCTGC CCTCGGCCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 

AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 

TGCCTTGGCA CCAGTG AAGC CCAGAGTAAT TTG AGTGG AG G AG AAAACCG AGTGCAAGTC 360 

CTAAAGACTG TTCCAGTGAA CCTTTCCCTA AATCAAGATC ACCTGGAGAA TTCCAAGCGG 420 

GAACAGTACA GCATCAGCTT CCCCGAGAGC TCTGCCATCA TCCCGGTGTC GGGAATCACG 480 

GTGGTGAAAG CTGAAGATTT CACACCAGTT TTCATGGOCC CACCTGTGCA CTATCCCCGG 540 

GGAGATGGGG AAGAGCAACG AGTGGTTATC TTTGAACAGA CTCAGTATGA CGTGCCCTCG 600 

CTGGCCACCC ACAGCGCCTA TCTCAAAG AC GACCAGCGCA GCACTCCGGA CAGCACATAC 660 

AGCGAGAGCT TCAAGGACGC AGCCACAGAG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720 

GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 

TCTCTCCGTC AGAAGCAGGG GGAGGGCCCC ATG ACCTACC TCAACAAAGG ACAGTTCTAT 840 

GCCATAACAC TCAGCGAGAC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 

AGG AGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA GAGATGAACA GCTCAAATAC 960 

TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 

TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTCC 1080 

TTTACCTGGG ACGTGAATGA AGAGGCGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1140 

GATTTCTCCT CCCAAAAAGG GGTGAAAGGA CTTCCTTTGA TGATTCAGAT TGACACATAC 1200 

AGTTATAACA ATCGTAGCAA TAAACCCATT CATAG AGCTT ATTGCCAGAT CAAGGTCTTC 1260 

TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 

GGGAAAGGCC AGGCCTCCCA AACTCAATGC AACAGCTCCT CFGATGGG AA GTTGGCTGOC 1380 

ATACCTTTAC AG AAGAAGAG TGACATCACC TACTTCAAAA CCATGCCTGA TCTCCACTCA 1440 

CAGCCAGTTC TCTTCATACC TGATGTTCAC TTTGCAAACC TGCAGAGGAC CX3GACAGGTG 1500 

TATTACAACA CGGATG ATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 

CCCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620 

CGAGTGCTCT TGTACGTGAG GAAGGAGACT GACGATGTGT TCGATGCATT GATGTTGAAG 1680 

TCTCCCACAG TGATGGGCCT GATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 

AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTGAACAT GGATGACAAC 1800 

ATCATCGAGC ACTACTOGAA OGAGGACAOC TTCATOCTTCA ACATGGAGAG CATGGTGGAG 1860 

GGCTTCAAGG TCACGCTCAT GGAAATCIA0 CCCTGGGTTT GGCATCCGCT TTGGCTGGAG 1920 

CTCTCAGTGC GTTCCTCCCT GAGAGAGACA GAAGCCCCAG CCCCAGAACC TGGAGACCCA 1980 

TCTCCCCCAT CTCACAACTG CTGTTACAAG ACCGTGCTGG GGAGTGGGGC AAGGGACAGG 2040 

CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCACCTAC CACGGAGCCG AAGCCIGAGC 2100 

CCCTCAGG AA GGTGCCTTAG GCCTGTTGG A TTCCTATTTA TIGCCCACCT TTTCCTGGAG 2160 

CCCAGGTCCA GGCCCGCCAG GACTCTGCAG GTCACTGCTA GCTCCAGATG AGACXXjTCCA 2220 

GCGTTCCCCC TTCAAGAGAA ACACTCATCC CGAACAGCCT AAAAAATTGC CATCCCTTCT 2280 

TTCTCACOCC TCCATATCTA TATCTOCCGA GTGGCTGG AC AAAATGAGCT AOGTCTGGGT 2340 

GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TCCCCACTTT CTGGTCAGAC ACCTTTAGGT 2400 

TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT CCCAGCAAGT GGCCACCAGG 2460 

CCTTGTACAG GAAGACATTC AGTCAOCGTG TAATTAGTAA CACAGAAAGT CTGOCTGTCT 2520 

GCATTGTACA TAGTGTTTAT AATATTGTAA TAATATATTT TACCTGTGGT ATGTGGGCAT 2580 

GTTTACTGCC ACTGGCCTAG AGGAGACACA G ACCTGGAG A CCGTTTTAAT GGGGGTTTTT 2640 

GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGCCTTTGG GATGTTAAGG 2700 

TG ACTGCAGC TGATGCCAAG ATGGACTCTG CAATGGGCAT ACCTGGGGGC TCGTTCCCTG 2760 

TCCCCAG AGG AAGCCCCCTC TCCTTCTCCA TGGGCATG AC TCTCCTTOGA GGCCACCACG 2820 

TTTATCTCAC AATG ATGTGT TTTGCCTG AC TTTCCCTTTG CGCTGTCTCG TGGGAAA GGT 2880 

CATTCTGTCT GAG ACCCCAG CTCCTTCTCC AGCTTTGGCT GCGGGCATGG CCTGAGCTTT 2940 

CTGGAGAGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 

TCCTTGGCTA TCAGGAGAAT CCTGGACACT GTACTGTGCC TCCCAGTTTA CAAAQACGCC 3060 

CTTCATCTCA AGTGGCCCTT TAAAAGGCCT GCTGCCATGT GAGAGCTGTG AACAGCTCAG 3120 

CTCTGAGTCG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 

GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAGATACC 3240 

TGGCTCCTGT GAAACCAGCC TCAGG AGGGA AACTGGGAGA GAGAAGCTGT GGTCTCCTGC 3300 

TACATGCCCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATG ATG AAC 3360 

CATC ATG GGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT CGCCCTTGTG 3420 

GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480 

GGCCCAGATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 

TCCTGCGCTC CCACCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 

GCTCAGCTGT TTCTCCTTGA GGTTGCGGAG GAATTGAATT GAATGGGACA GAGGGCAGGT 3660 

GCTGTGGCCA AG AAGATCTC CGAGCAGCAG TGACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720 

GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCGTG GATCCAGCTG TGCTCCAGTC 3780 

TGTCCCCTCC TCCTCCACTC TGACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 

GGGTACTAAT GGGGCTCTGT TCTGAGATGG ACAAATTC AG TGTTGGAAAT ACATGTTGTA 3900 

CTATGCACTT CCCATGCTCC TAGGGTTAGG AATAGTTTCA AACATGATTG GCAGACATAA 3960 

CAACGGCAAA TACTCGGACT GGGGCATAGG ACTCCAGAGT AGGAAAAAGA CAAAAGATTT 4020 

GGCAGCCTG A CAC AGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATGAA ACTGTTTGTT 4080 

TGOCAGTOCT GCCCTAAGGC AGAAGATGAA TTGAAGATGC TGTGCATGTT TCCTAAGTCC 4140 

TTGAGCAATC ATGGTGGTGA CAATTGCCAC AAGGGATATG AGGCCAGTGC CACCAGAGGG 4200 
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TGGTGCCAAG TGCCACATCC CTTCCGATCC ATTCCCCTCT GTATCCTCGG AGCACCCCAG 4260 
TTTGCCTTTG ATGTGTCCGC TGTGTATGTT AGCTGAACTT TGATG AGCAA AATTTCCTGA 4320 
GCGAAACACT CCAAAGAGAT AGG AAAACTT GCCGCCTCIT CTTTTTTGTC CCTTAATCAA 4380 
ACTCAAATA A GCTTAAAAAA AATCCATGG A AGATCATGGA CATGTGAA AT GAGCATTTTT 4440 
TTCTTTTCTT TTTTTTTTTI TTTTTTTAAC AAAGTCTGAA CTGAACAGAA CAAGACTTTT 4500 
TCCTCATACA TCTCCAAATT GTTTAAACTT ACTTTATG AG TGTTTGTTTA GAAGTTCGGA 4560 
CCAACAGAAA AATGCAGTCA GATGTCATCT TGG AATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCCCTGCCCA G AAACTTAGG AAGCATG AAA TAAATC AAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATGCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TIT 



1 11 21 31 41 51 
I I I I I I 

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMIINGDEDS 60
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120
PVNLSLNQDH LENSKREQYS ISFPESSAII PVSGITVVKA EDFTPVFMAP PVHYPRGDGE 180
EQRVVIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240
DQTSSGTFQY TLEATKSLRQ KQGEGPMTYL NKGQFYAITL SETGDNKCFR HPISKVRSVV 300
MVVFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNIEEI AYNAVSFTWD 360
VNEEAKIFIT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQIKVFCDKG 420
AERKIRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480
FIPDVHFANL QRTGQVYYNT DDEREGGSVL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540
YVRKETDDVF DALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKGIL VNMDDNIIEH 600
YSNEDTFILN MESMVEGFKV TLMEI
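
The "Coding sequence" line of each DNA entry gives 1-based nucleotide coordinates, and the protein listing corresponds to translating the DNA from the stated start position. As a hedged illustration, the Python sketch below translates the first 60-nucleotide line of the BCU4 DNA sequence from position 13 using the standard genetic code. The function name `translate` and the table construction are assumptions made for this example and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start):
    """Translate from a 1-based start position until a stop codon or the end."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        # A KeyError here means the input holds a character other than A, C, G, T.
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line (nucleotides 1-60) of the BCU4 DNA sequence listed above.
first_line = "ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC"
peptide = translate(first_line, 13)
print(peptide)
```

The translated residues match the beginning of the protein listing (MSQESDNNKR LVALVP...), which is consistent with a coding sequence that starts at nucleotide 13.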



SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1:

Nucleic Acid Accession #: AA428082

Coding sequence: 1-777 (entire sequence represents open reading frame) 



1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame)



1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 
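
The two BCU7 DNA variants can be compared position by position to locate where they differ. The Python sketch below does this for the lines covering nucleotides 541-600 of SEQ ID NO:3 and SEQ ID NO:4. It assumes the two sequences are aligned without gaps and that the coding sequence starts at nucleotide 1, as stated in the headers; the function name `differences` is hypothetical and is not part of the original disclosure.

```python
def differences(seq_a, seq_b, offset=1):
    """List (position, base_a, base_b) for every mismatch between two sequences."""
    a = seq_a.replace(" ", "").upper()
    b = seq_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for a gap-free comparison")
    return [(offset + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Nucleotides 541-600 of each variant, transcribed from the listings above.
variant1 = "TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG"
variant2 = "TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG"

diffs = differences(variant1, variant2, offset=541)
# With the coding sequence starting at nucleotide 1, map each position to a codon.
codons = [(pos - 1) // 3 + 1 for pos, _, _ in diffs]
print(diffs, codons)
```

In this window the only mismatch falls in codon 190 (GCT in variant 1, ACT in variant 2), which agrees with the alanine at residue 190 in protein variant 1 and the threonine at the same residue in protein variant 2.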

SEQ ID NO:5 BCU7 Protein sequence Variant 1
Protein Accession #: none

1          11         21         31         41         51

|          |          |          |          |          |

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180

SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK

SEQ ID NO:6 BCU7 Protein sequence Variant 2
Protein Accession #: none

1          11         21         31         41         51

|          |          |          |          |          |

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK



SEQ ID NO:7 BCX2 DNA SEQUENCE

Nucleic Acid Accession #: NM_003014

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GGCGGGTTCG CGCCCCGAAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240
TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 
ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACX3CCA TCCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TCGACGTGAA CTGCAGCGCC GTGCTGCGCTT C I 1C 1 1 CTG TGCCATGTAC 480 
GCGCCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 
AGCCTGGCCT GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660
ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC (XGATCGGTG CAAGTGTAAA 780 
AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 
GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960
TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200
GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260
AACCCGAAAA GAGTG32AGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320
GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGCOCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGGTTCA 1440
GTTTTICI FT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620
AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAAATATAA TGTTTTTAAG 1740
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800
'11 111 GTG AT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TGTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920
AATAATAAAG AAAAATAAAT AAAAAGGAGA GGCAGACAAT GTCTGGATTC CTGTTTTTTG 1980
GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040
ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220
TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280
AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340
AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 
AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580
GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640
CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence:
Protein Accession #: NP_003005.1

1 11 21 31 41 51 
I I I I I I 

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120
ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180
KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV

SEQ ID NO:9 C6K1 DNA SEQUENCE

Nucleic Acid Accession #: NM_032391

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 60 

AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 120 

GAACAGCGAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA 180

AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA 240

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGAGGCAGG AAGATTCCTT 300 

GAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG G 
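
The "Coding sequence" ranges given for each DNA entry (for example 129-302 above) can be read as 1-based, inclusive positions into the listed sequence. The following is a minimal sketch, not part of the original listing, of how such a range might be checked against the start and stop codons. The function names are hypothetical, and the example row is a made-up toy sequence rather than any SEQ ID from this document.

```python
import re

# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing_rows(rows):
    """Join listing rows into one uppercase string.

    Spaces between 10-base groups and the trailing position counter
    (e.g. '60', '120') are dropped by keeping letters only. Note that
    any stray digit inside a damaged row is silently dropped as well.
    """
    return "".join(re.sub(r"[^A-Za-z]", "", row) for row in rows).upper()


def extract_cds(sequence, start, end):
    """Return the region for 1-based, inclusive coordinates such as 129-302."""
    return sequence[start - 1:end]


def check_cds(cds):
    """Report basic sanity checks for a candidate coding sequence."""
    return {
        "length_multiple_of_3": len(cds) % 3 == 0,
        "starts_with_ATG": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
    }


# Toy example (hypothetical 18-base row, coding sequence assumed at 5-16).
toy_rows = ["GGCCATGAAA TTTTGACC 18"]
toy_seq = parse_listing_rows(toy_rows)
toy_cds = extract_cds(toy_seq, 5, 16)
print(toy_cds, check_cds(toy_cds))
```

A failed check on a real entry would more likely point to a transcription error in the row text than to an error in the stated range, so the result is best treated as a prompt to re-inspect the row.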



SEQ ID NO:10 CpK1 Protein sequence:
Protein Accession #: NP_115767

1 11 21 31 41 51

I I I I I I 

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRKIP

SEQ ID NO:11 CHA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_020182

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

TCCTTGGGTT CGGGTGAAAG CGCCTGGGGG TTCGTGGCCA TGATCCCCGA GCTGCTGGAG 60 

AACTGAAGGC GGACAGTCTC CTGCGAAACC AGGCAATGGC GGAGCTGGAG TTTGTTCAGA 120 

TCATCATCAT CGTGGTGGTG ATGATGGTGA TGGTGGTGGT GATCACGTGC CTGCTGAGCC 180 

ACTACAAGCT GTCTGCACGG TCCTTCATCA GCCGGCACAG CCAGGGGCGG AGGAGAGAAG 240 

ATGCCCTGTC CTCAGAAGGA TGCCTGTGGC CCTCGGAGAG CACAGTGTCA GGCAACGGAA 300 

TCCCAGAGCC GCAGGTCTAC GCCCCGCCTC GGCCCACCGA CCGCCTGGCC GTGCCGCCCT 360

TCGCCCAGCG GGAGCGCTTC CACCGCTTCC AGCCCACCTA TCCGTACCTG CAGCACGAGA 420 

TCGACCTGCC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TACCAGGGCC 480 

CCTGCACCCT CCAGCTTCGG GACCCCGAGC AGCAGCTGGA ACTGAACCGG GAGTCGGTGC 540 

GCGCACCCCC AAACAGAACC ATCTTCGACA GTGACCTGAT GGATAGTGCC AGGCTGGGCG 600 

GCCCCTGCCC CCCCAGCAGT AACTCGGGCA TCAGCGCCAC GTGCTACGGC AGCGGCGGGC 660

GCATGGAGGG GCCGCCGCCC ACCTACAGCG AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 

TCCAGCACCA GCAGAGCAGT GGGCCGCCCT CCTTGCTGGA GGGGACCCGG CTCCACCACA 780

CACACATCGC GCCCCTAGAG AGCGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG 840 

GACACCCTCT CTAGGGTCCC CAGGGGGGCC GGGCTGGGGC TGCGTAGGTG AAAAGGCAGA 900 

ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AACGCATCGT 960 

GTGGCCCTCC CCTCCCACCT CCCTGTGTAT AAATATTTAC ATGTGATGTC TGGTCTGAAT 1020 

GCACAAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAG AAAAAAAAAA ACCACGTTTC 1080 

TTTGTTGAGC TGTGTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA AAAAAAAAAA 1140 
A 

SEQ ID NO:12 CHA1 Protein sequence:
Protein Accession #: NP_064567

1 11 21 31 41 51 

I I I I I I 

MAELEFVQII IIVVVMMVMV VVITCLLSHY KLSARSFISR HSQGRRREDA LSSEGCLWPS 60
ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120
EEPPPYQGPC TLQLRDPEQQ LELNRESVRA PPNRTIFDSD LMDSARLGGP CPPSSNSGIS 180 
ATCYGSGGRM EGPPPTYSEV IGHYPGSSFQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKEKDKQKGH PL 

SEQ ID NO:13 CJA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012445

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 



GCACGAGGGA AGAGGGTGAT OCGACCCGGG GAAGGTCGCT GGGCAGGGCG AGTTGGGAAA 60 

GCGGCAGCCC OCGCCGCCOC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT CCTATCTGCC 120 

TCTCGCTGGA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 

GGCCCGGGGC GCCGGCCTCQ GGCTTAAATA GGAGCTCCGG GCTCTGGCTO GGACCCGACC 240 

GCTGCCGGCC GCGCTCCCGC TGCTCCTGCC GGGTGATGGA AAACCCCAGC CCGGCCGCCG 300 

CCCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCCGCC GGCCAGCCTC 360 

TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420

GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT GCTGGGGGCC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCG CTGATGAAGG 600 

AGATCGAGGC GGCGGGGGAG GCGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGCCCGCCG 660 

TCCCCAGCGG CACCGGGCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 

TCTCGTTTGT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTGGGCGTG GACAGCCTGG 780 

ACCTGTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 

CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCCCCAACTT CGCCACCATC CCGCAGGACA 900

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 

GGCTGAAGGC CCTGCCTCCC ATCGCCAGGG TGACACTGGT GCGGCTGCGA CAGAGCCCCA 1020 

GGGCCTTCAT CCCTCCCGCC CCAGTCCTGC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGCCG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200

CCGCCAACAA CGGGAGCCCC TGCCCCGAGC TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 

ACTGCGTCTA AGACCAGAGC OCCGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320

GGCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCGTGTCCCG 1500 

TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCTGC TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTCCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence:
Protein Accession #: NP_036577

1 11 21 31 41 51 

MENPSPAAAL GKALCALLLA TLGAAGQPLG GESICSARAP AKYSITFTGK WSQTAFPKQY 60

PLFRPPAQWS SLLGAAHSSD YSMWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 

HAVFSAPAVP SGTGQTSAEL EVQRRHSLVS FVVRIVPSPD WFVGVDSLDL CDGDRWREQA 180

ALDLYPYDAG TDSGFTFSSP NFATIPQDTV TEITSSSPSH PANSFYYPRL KALPPIARVT 240 

LVRLRQSPRA FIPPAPVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300
RTRYVRVQPA NNGSPCPELE EEAECVPDNC V






SEQ ID NO:15 IBH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002391
Coding sequence: 26-457 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360

CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 






SEQ ID NO:16 LBH9 Protein sequence:
Protein Accession #: NP_002382



1 11 21 31 41 51 

I I I I I I

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60

CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 

RVTKPCTPKT KAKAKAKKGK GKD 

SEQ ID NO:17 LEM9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005244

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

ATGGTAGAAC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCTGACGC TGCTGTGTGG ACTCTGAGTG ACAGACAAGG CATCACCAAA 120 

TCGGCCCCCC TGAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGGACAG AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACCC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT CCCGGCCAGC AGCATCTGCC CTTCGCCCCT CTCCACGTCC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATGA GACAATAATT 840

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGGAAGGA CACCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATCGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA AGTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGGAGTA TTTATAG

SEQ ID NO:18 LEM9 Protein sequence:
Protein Accession #: NP_005235 



1 11 21 31 41 51 

I I I I I I 

MVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLPR 60

QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS IKTEDSLNHS PGQSGFLSYG 120

SSFSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGFGSVH QDYPSYPGFP QSQYPQYYGS 180

SYNPPYVPAS SICPSPLSTS TYVLQEASHN VPNQSSESLA GEYNTHNGPS TPAKEGDTDR 240

PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300

SVRIGLMMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360

GANLCLGSGV HGGVDWMRKL AFRYRRVKEM YNTYKNNVGG LIGTPKRETW LQLRAELEAL 420

TDLWLTHSLK ALNLINSRPN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480
ESCFERIMQR FGRKAVYVVI GDGVEEEQGA KKHNMPFWRI SCHADLEALR HALELEYL

SEQ ID NO:19 OAA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_002740

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CCGCGGTTCC GGCTGCTCCG GCGAGGCGAC CCTTGGGTCG GCGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCCCCCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TGTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTCCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTGATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 
ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320

GACAATGTAT TACTGGACTC TGAAGGCCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380 

GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440 

CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGTGGGC TCTTGGAGTG 1500 

CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560 

CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 

CCACGTTCTC TGTCTGTAAA AGCTGCAAGT GTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 

AAGGAACGAT TGGGTTGTCA TCCTCAAACA GGATTTGCTG ATATTCAGGG ACACCCGTTC 1740 

TTCCGAAATG TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 

AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCTGTC 1860 

CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920 

TTTGAGTATA TCAATCCTCT TTTGATGTCT GCAGAAGAAT GTGTCTGATC CTCATTTTTC 1980 

AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATGGAT AAACTTGCTG CAAGCCTGGA 2040 

TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100 

ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 

TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA G 



SEQ ID NO:20 OAA1 Protein sequence: 
Protein Accession #: NP_002731 

1 11 21 31 41 51

I I I I I I 

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HFEPSISFEG LCNEVRDMCS FDNEQLFTMK 60
WIDEEGDPCT VSSQLELEEA FRLYELNKDS ELLIHVFPCV PERPGMPCPG EDKSIYRRGA 120
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180
CGRHSLPQEP VMPMDQSSMH SDHAQTVIPY NPSSHESLDQ VGEEKEAMNT RESGKASSSL 240
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKVVKKEL VNDDEDIDWV QTEKHVFEQA 300
SNHPFLVGLH SCFQTESRLF FVIEYVNGGD LMFHMQRQRK LPEEHARFYS AEISLALNYL 360
HERGIIYRDL KLDNVLLDSE GHIKLTDYGM CKEGLRPGDT TSTFCGTPNY IAPEILRGED 420
YGFSVDWWAL GVLMFEMMAG RSPFDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480
ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRNVDWDM MEQKQVVPPF KPNISGEFGL 540
DNFDSQFTNE PVQLTPDDDD IVRKIDQSEF EGFEYINPLL MSAEECV



SEQ ID NO:21 OBH2 DNA SEQUENCE

Nucleic Acid Accession #: L05628

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CCAGGCGGCG TTGCGGCCCC GGCCCCGGCT CCCTGCGCCG CCGCCGCCGC CGCCGCCGCC 60 

GCCGCCGCCG CCGCCGCCAG CGCTAGCGCC AGCAGCCGGG CCCGATCACC CGCCGCCCGG 120 

TGCCCGCCGC CGCCCGCGCC AGCAACCGGG CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC ACCGGCATGG CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACGT GGAATACCAG CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTCTGGGCC TGTTTCCCCT TCTACTTCCT 360 

CTATCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACCACGCTG CTTGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTGT GCCCTAGCCA TCCTGAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC TTTTCCGTCT TACTCATTCA GCTCGTCTTG TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC CCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACCGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGTACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAGT TCCAAGGTGG ATGCGAATGA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTT CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG CTGCTGTTTG TCACTGCCTG 1320 

CCTGCAGACC CTCGTGCTGC ACCAGTACTT CCACATCTGC TTCGTCAGTG GCATGAGGAT 1380 

CAAGACCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA ACATGATCTG GTCAGCCCCC CTGCAAGTCA TCCTTGCTCT 1560 

CTACCTCCTG TGGCTGAATC TGGGCCCTTC CGTCCTGGCT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TGGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CCTTCGTGTC TTTGGCCTTG TTCAACATCC TCCGGTTTCC 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTGTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGCCAG 2160 

GAGCGACCCT CCCACACTGA ATGGCATCAC CTTCTCCATC CCCGAAGGTG CTTTGGTGGC 2220 
CGTGGTGGGC CAGGTGGGCT GCGGAAAGTC GTCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280

GGACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTCCGTG GCCTATGTGC CACAGCAGGC 2340

CTGGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GGATGTCAGC TGGAGGAACC 2400 

ATATTACAGG TCCGTGATAC AGGCCTGTGC CCTCCTCCCA GACCTGGAAA TCCTGCCCAG 2460 

TGGGGATCGG ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC AGAAGCAGCG 2520 

CGTGAGCCTG GCCCGGGCCG TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATGAGCTACT TGCCGCAGGT 2700 

GGACGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCGTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGGGAAG CAACTGCAGA GACAGCTCAG 2940 

CAGCTCCTCC TCCTATAGTG GGGACATCAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAGAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240 

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480 

GGGCTCCCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA CGCCCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCCCGG CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGTGT GAGCCCCAGA 4800

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 



SEQ ID NO:22 OBH2 Protein sequence: 
Protein Accession #: AAB46616

1 11 21 31 41 51 

I I I I I I 

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL LWIVCWADLF YSFWERSRGI FLAPVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIMLT FWLVALVCAL AILRSKIMTA LKEDAQVDLF RDITFYVYFS 180

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPE SSASFLSRIT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQVVPVLVKN WKKECAKTRK QPVKVVYSSK DPAQPKESSK VDANEEVEAL 300

IVKSPQKEWN PSLFKVLYKT FGPYFLMSFF FKAIHDLMMF SGPQILKLLI KFVNDTKAPD 360 

WQGYFYTVLL FVTACLQTLV LHQYFHICFV SGMRIKTAVI GAVYRKALVI TNSARKSSTV 420

GEIVNLMSVD AQRFMDLATY INMIWSAPLQ VILALYLLWL NLGPSVLAGV AVMVLMVPVN 480

AVMAMKTKTY QVAHMKSKDN RIKLMNEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540

KSAYLSAVGT FTWVCTPFLV ALCTFAVYVT IDENNILDAQ TAFVSLALFN ILRFPLNILP 600 

MVISSIVQAS VSLKRLRIFL SHEELEPDSI ERRPVKDGGG TNSITVRNAT FTWARSDPPT 660

LNGITFSIPE GALVAVVGQV GCGKSSLLSA LLAEMDKVEG HVAIKGSVAY VPQQAWIQND 720

SLRENILFGC QLEEPYYRSV IQACALLPDL EILPSGDRTE IGEKGVNLSG GQKQRVSLAR 780 

AVYSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSMS YLPQVDVIIV 840 

MSGGKISEMG SYQELLARDG AFAEFLRTYA STEQEQDAEE NGVTGVSGPG KEAKQMENGM 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEETWKLMEA DKAQTGQVKL 960

SVYWDYMKAI GLFISFLSIF LFMCNHVSAL ASNYWLSLWT DDPIVNGTQE HTKVRLSVYG 1020 

ALGISQGIAV FGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG NLVNRFSKEL 1080

DTVDSMIPEV IKMFMGSLFN VIGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVIRAFEE QERFIHQSDL KVDENQKAYY PSIVANRWLA 1200 

VRLECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTYLNWLVRM SSEMETNIVA 1260 

VERLKEYSET EKEAPWQIQE TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHINVTINGG 1320 

EKVGIVGRTG AGKSSLTLGL FRINESAEGE IIIDGINIAK IGLHDLRFKI TIIPQDPVLF 1380

SGSLRMNLDP FSQYSDEEVW TSLELAHLKD FVSALPDKLD HECAEGGENL SVGQRQLVCL 1440 

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQFEDCT VLTIAHRLNT IMDYTRVIVL 1500 
DKGEIQEYGA PSDLLQQRGL FYSMAKDAGL V 

SEQ ID NO:23 PAA2 DNA SEQUENCE

Nucleic Acid Accession #: NM_013309

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGCCGGCT CTGGCGCGTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TGATGCGCCG 60

CTGTTTTTAA ATGACACCAG CGCCTTTGAC TTCTCGGATG AGGCGGGGGA CGAGGGGCTT 120 

TCTCGGTTCA ACAAACTTCG AGTTGTGGTG GCCGATGACG GTTCCGAAGC CCCGGAAAGG 180 

CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240 

TTACCTTTGA CCAACAGTCA GCTGAGTTTG AAGGTGGACT CCTGTGACAA CTGCAGCAAA 300 

CAGAGAGAGA TACTGAAGCA GAGAAAGGTG AAAGCCAGGT TGACCATTGC TGCCGTTCTG 360 

TACTTGCTTT TCATGATTGG AGAACTTGTA GGTGGATACA TTGCAAATAG CCTAGCAATC 420 

ATGACAGATG CACTTCATAT GTTAACTGAC CTAAGCGCCA TCATACTCAC CCTGCTTGCT 480 

TTGTGGCTAT CATCAAAATC ACCAACCAAA AGATTCACCT TTGGATTTCA TCGCTTAGAG 540 

GTTTTGTCAG CTATGATTAG TGTGCTGTTG GTGTATATAC TTATGGGATT CCTCTTATAT 600 

GAAGCTGTGC AAAGAACTAT CCATATGAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660 

ACCGCAGCTG TTGGAGTTGC AGTTAATGTA ATAATGGGGT TTCTGTTGAA CCAGTCTGGT 720 

CACCGTCACT CCCATTCCCA CTCCCTGCCT TCAAATTCCC CTACCAGAGG TTCTGGGTGT 780 

GAACGTAACC ATGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGCTTTGGGA 840 

GATTTGGTAC AGAGTGTTGG TGTGCTAATA GCTGCATACA TCATACGATT CAAGCCAGAA 900 

TACAAGATTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960 

TTTCGAATCA TATGGGATAC AGTAGTTATA ATACTAGAAG GTGTGCCAAG CCATTTGAAT 1020 

GTAGACTATA TCAAAGAAGC CTTGATGAAA ATAGAAGATG TATATTCAGT CGAAGATTTA 1080 

AATATCTGGT CTCTCACTTC AGGAAAATCT ACTGCCATAG TTCACATACA GCTAATTCCT 1140 

GGAAGTTCAT CTAAATGGGA GGAAGTACAG TCCAAAGCAA ACCATTTATT ATTGAACACA 1200 

TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAGTTACA GGCAAGAAGT GGACAGAACT 1260 
TGTGCAAATT GTCAGAGTTC TAGTCCCTGA 



SEQ ID NO:24 PAA2 Protein sequence:
Protein Accession #: NP_037441

1 11 21 31 41 51 

I I I I I I 

MAGSGAWKRL KSMLRKDDAP LFLNDTSAFD FSDEAGDEGL SRFNKLRVVV ADDGSEAPER 60
PVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120
YLLFMIGELV GGYIANSLAI MTDALHMLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180
VLSAMISVLL VYILMGFLLY EAVQRTIHMN YEINGDIMLI TAAVGVAVNV IMGFLLNQSG 240
HRHSHSHSLP SNSPTRGSGC ERNHGQDSLA VRAAFVHALG DLVQSVGVLI AAYIIRFKPE 300
YKIADPICTY VFSLLVAFTT FRIIWDTVVI ILEGVPSHLN VDYIKEALMK IEDVYSVEDL 360
NIWSLTSGKS TAIVHIQLIP GSSSKWEEVQ SKANHLLLNT FGMYRCTIQL QSYRQEVDRT 420 
CANCQSSSP 



SEQ ID NO:25 PAA3 DNA SEQUENCE

Nucleic Acid Accession #: AB037765

Coding sequence: 375-2798 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

GCCGAGTCGG TGGCGGCTGC AGGCTGGGAG GGAGAAGTGC TACGCCTTTG CAGGTTGGCG 60

AAGTGGTTCC AGGCTACCCG GCTAGTCTGG CACGGCCCCG TCTTCTGCCT CCTCCTCCGT 120

CGCGTGGCGG OGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT CCCCGCCCGC 180

AGGTCCCGGG CAGATAACAT AGATCATCAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240

ATTTGAAAGT AGCAAAATAG AAAATAAAGA ATTAACAGCA GATACAGAGG ACAGCATGGA 300

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360

AACTGCAGCT GATAATGTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTACC AGAACTGAGT CCTCAGAAAT 480

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540

ACTATGGAAT TTCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA AGATACTGTG 600

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660

TCCCTACTGA CACCTTGTTT GATGTGAATG CCATTGTCGC CCATGTTCTC TTTGCTCTTC 720

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780

TGAAAGGAAA AGCAAATATT ATATTCTCAT ATGTAAGAGC CATTGGAATA CCAGAGCACA 840

GAGCAGTCAT GGAAGCCGGT TTTGTGTATG GGACTACATA CCAATTTGTC TTAACCACAG 900

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATATGCACAT CTCTACTTTT 960

TTCATTGTAA ACTAGTCTTG GACTTGACCC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020

CATTGACTAC ACTGAACATT CACCTGTTTA TTAAGACAAT GAAAGCACCT CTGTTGACTG 1080

AAGTTGCTGA AGATCCTCAA CAAGTTTCAA CTGTCCATCT CCAACTGGGC TTACCACTGG 1140

TTTTTATTGT TAGCCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTG 1200

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AAGGGACTCT TTGGAAGTGA 1260

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGTT CCAGTGGAAT 1320

TTTTGGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAAT ATGCACATTG 1380

AGGAAATACA AGAAGATGAA GACAATGACA TGGAAGGTCC AGATATAGAT GTTCAGGATG 1440

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500
TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAGTACTCT 1560

TCTATGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740 

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTCCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860 

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA raVTTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTCCTCTT CTTGTTTTGG 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGATAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATATAG AAATTATTAA TGAGATATTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGCGTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200 

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATTAGC CTAATTATTA 4380 

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA? Protein sequence
Protein Accession #: BAA92582 

1 11 21 31 41 51 

I I I I I I 

MFSGFNVFRV GISFVIMCIF YMPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60 

VAKVNCVKEE ISRYCGKEKD LMKAYLFKGN ILLREFPTDT LFDVNATVAH VLFALLFSEV 120 

KYITNLEDLQ NIENALKGKA NIIFSWRAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180 

ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240 

PQOVSTVHLQ LGLPLVFIVS QQATYEADRR TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300 

ANWFKRAEE GVPVEFLVLH DVDLIISHVE NNMHIEEIQE DEDNDMEGPD IDVQDDEVAE 360 

TVFRDRKRKL PLELTVELTE ETFNATVMAS DSIVLFYAGW QAVSMAFLQS YIDVAVKLKG 420 

TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKFIQLNRIS 480 

YPVNITSIQE AEEYLSGELY KDLILYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVITG 540 

IYSEEDVLLL STKYAASLPA LLLARHTEGK IESIPLASTH AQDIVQIITD ALLEMFPEIT 600 

VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENLVL WLKKLEAGLE NHITILPAQE 720 

WKPPLPAYDF LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KEPIETLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN 

SEQ ID NO:27 PAA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012449 

Coding sequence: 66-1085 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAG 60 

AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120 
GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180 

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780 

CATCTGTGAG TGACTCTTTG ACATGGAGAG AATTTCACTA TATTCAGAGC AAGCTAGGAA 840 

TTGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAGAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 



SEQ ID NO:28 PAA5 Protein sequence 
Protein Accession #: NP_036581 

1 11 21 31 41 51 

I I I I I I 

MESRKDITNQ EELWKMKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60 

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVIHPLAT SHQQYFYKIP ILVINKVLPM 120 

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFF FAVLHAIYSL 180 

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240 

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300 
VLIFKSILFL PCLRKKILKI RHGWEDVTKI NKTEICSQL 



SEQ ID NO:29 PAA7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_030774 

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300 

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGCGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CTGATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGGTATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660 

ATACGAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGOCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTATTGGCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT TTTTTTTTTT TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTTCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTTTC CAGATTCAGG GAGAATGTTG 2400 

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520 
TAAAATTTTA TTTTAAATTT T 
SEQ ID NO:30 PAA7 PROTEIN SEQUENCE 

Protein Accession #: NP_110401 

1 11 21 31 41 51 

I I I I I I 

MSSCNFTHAT FVLIGIPGLE KAHFWVGFPL LSMYVVAMFG NCIVVFIVRT ERSLHAPMYL 60 

FLCMLAAIDL ALSTSTMPKI LALFWFDSRE ISFEACLTQM FFIHALSAIE STILLAMAFD 120 

RYVAICHPLR HAAVLNNTVT AQIGIVAVVR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH 180 

QDVMKLAYAD TLPNVVYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC 240 

VSHIGVVLAF YVPLIGLSVV HRFGNSLHPI VRVVMGDIYL LLPPVINPII YGAKTKQIRT 300 
RVLAMFKISC DKDLQAVGGK 

SEQ ID NO:31 PAV6 DNA SEQUENCE 

Nucleic Acid Accession #: XM_050837 

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGAACTGGG AGCTGCTGCT GTGGCTGCTG GTGCTGTGCG CGCTGCTCCT GCTCTTGGTG 60 

CAGCTGCTGC GCTTCCTGAG GGCTGACGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 300 

AAAGAAAAAG ATATACTTGT TTTGCCCCTT GACCTGACCG ACACTGGTTC CCATGAAGCG 360 

GCTACCAAAG CTGTTCTCCA GGAGTTTGGT AGAATCGACA TTCTGGTCAA CAATGGTGGA 420 

ATGTCCCAGC GTTCTCTGTG CATGGATACC AGCTTGGATG TCTACAGAAA GCTAATAGAG 480 

CTTAACTACT TAGGGACGGT GTCCTTGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 540 

AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTTTCC 600 

ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 660 

CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 900 

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 960 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA 

SEQ ID NO:32 PAV6 PROTEIN SEQUENCE 

Protein Accession #: XP_050837 

1 11 21 31 41 51 

I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60 

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180 

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD 

SEQ ID NO:33 PBA6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006853 

Coding sequence: 26-874 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240 

CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300 

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600 

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780 

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840 

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG 

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE 

Protein Accession #: NP_006844 



1 11 21 31 41 51 

I I I I I I 

MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA 60 

AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV 120 

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG 180 

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 240 
DWIQETMKNN 

SEQ ID NO:35 PBC1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001775 

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CTAAAGCTCT CTTGCTGCCT AGCCTCCTGC CGGCCTCATC TTCGCCCAGC CAACCCCGCC 60 

TGGAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120 

CTCTCTAGGA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTCCTGAT CCTCGTCGTG 180 

GTGCTCGCGG TGGTCGTOCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCACCAAG 240 

CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300 

AGACATGTAG ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420 

CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TGGCCCATCA GTTCACACAG 480 

GTCCAGCGGG ACATGTTCAC CCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540 

ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600 

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGGTTTCCCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAGAGAA GGTTCAGACA 780 

CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGGATCCC 840 

ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900 

ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTGAAAAATC CTGAGGATTC ATCTTGCACA 960 

TCTGAGATCT GAGCCAGTCG CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020 

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140 

CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200 

AATGAAAATT GTATGTTAAG TTACTTCCTT TAG 
SEQ ID NO:36 PBC1 Protein sequence 
Protein Accession #: NP_001766 



1 11 21 31 41 51 

I I I I I I 

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVVVLA VVVPRWRQTW SGPGTTKRFP 60 

ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAFISKHPCN ITEEDYQPLM KLGTQTVPCN 120 

KILLWSRIKD LAHQFTQVQR DMFTLEDTLL GYLADDLTWC GEFNTSKINY QSCPDWRKDC 180 

SNNPVSVFWK TVSRRFAEAA CDVVHVMLNG SRSKIFDKNS TFGSVEVHNL QPEKVQTLEA 240 
WVIHGGREDS RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI 

SEQ ID NO:37 PBH1 DNA SEQUENCE 

Nucleic Acid Accession #: XM_017718 

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60 

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120 

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTCTTCTT TACCAAAGAT 180 

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360 

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGARA 420 

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600 

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660 

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900 

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960 

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200 

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260 

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500 

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560 

CTCACGTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA GGAAGACAGA 1620 

AATGGCCGGG ACGAGATGGA CATAGAACTC CACGACGTGT CTCCTATTAC TCGGCACCCC 1680 

CTGCAAGCTC TCTTCATCTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GCCCTGGGAG CCAGCAAGCT TCTGAAGACT 1800 

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGGAGCT GGCTAATGAG 1860 

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTGTTT 2100 

ATTATACCCT TGGTGGGCTG TGGCTTTGTA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160 

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGGT CTTCTCCTGG 2220 

AATGTGGTCT TCTACATCGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC CGAGCTGGTC CTGTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400 

ATGGACACGC TGGGGCTTTT TTACTTCATA GCAGGAATTG TATTTCGGCT CCACTCTTCT 2460 

AATAAAAGCT CTTTGTATTC TGGACGAGTC ATTTTCTGTC TGGACTACAT TATTTTCACT 2520 

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAG GACCCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTG GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTACCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760 

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTCTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCCCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTC 2880 

TGCATCTACA TGTTATCCAC CAACATCCTG CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940 

TACACGGTGG GCACCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000 

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060 

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180 

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA 

SEQ ID NO:38 PBH1 PROTEIN SEQUENCE 
Protein Accession #: XP_017718 

1 11 21 31 41 51 

I I I I I I 

MSFRAARLSM RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANFK KRECVFFTKD 60 

SKATENVCKC GYAQSQHMEG TQINQSEKWN YKKHTKEFPT DAFGDIQFET LGKKGKYIRL 120 

SCDTDAEILY ELLTQHWHLK TPNLVISVTG GAKNFALKPR MRKIFSRLIY IAQSKGAWIL 180 

TGGTHYGLMK YIGEVVRDNT ISRSSEENIV AIGIAAWGMV SNRDTLIRNC DAEGYFLAQY 240 

LMDDFTRDPL YILDNNHTHL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300 

IVCFAQGGGK ETLKAINTSI KNKIPCVVVE GSGQIADVIA SLVEVEDALT SSAVKEKLVR 360 

FLPRTVSRLP EEETESWIKW LKEILECSHL LTVIKMEEAG DEIVSNAISY ALYKAFSTSE 420 

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RRWESADLQE VMFTALIKDR PKFVRLFLEN 480 

GLNLRKFLTH DVLTELFSNH FSTLVYRNLQ IAKNSYNDAL LTFVWKLVAN FRRGFRKEDR 540 

NGRDEMDIEL HDVSPITRHP LQALFIWAIL QNKKELSKVI WEQTRGCTLA ALGASKLLKT 600 

LAKVKNDINA AGESEELANE YETRAVELFT ECYSSDEDLA EQLLVYSCEA WGGSNCLELA 660 

VEATDQHFIA QPGVQNFLSK QWYGEISRDT KNWKIILCLF IIPLVGCGFV SFRKKPVDKH 720 

KKLLWYYVAF FTSPFVVFSW NVVFYIAFLL LFAYVLLMDF HSVPHPPELV LYSLVFVLFC 780 

DEVRQWYVNG VNYFTDLWNV MDTLGLFYFI AGIVFRLHSS NKSSLYSGRV IFCLDYIIFT 840 

LRLIHIFTVS RNLGPKIIML QRMLIDVFFF LFLFAVWMVA FGVARQGILR QNEQRWRWIF 900 

RSVIYEPYLA MFGQVPSDVD GTTYDFAHCT FTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CIYMLSTNIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQEYCSRLN IPFPFIVFAY 1020 

FYMVVKKCFK CCCKEKNMES SVCCFKNEDN ETLAWEGVMK ENYLVKINTK ANDTSEEMRH 1080 
RFRQLDTKLN DLKGLLKEIA NKIK 



SEQ ID NO:39 PBH3 DNA SEQUENCE 

Nucleic Acid Accession #: XMJH1804 

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60 

AGAGCAGTCG CGGCCAAATG GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120 

CGCGCGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180 

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACCGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420 

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480 

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540 
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE 

Protein Accession #: NPJXW842 

1 11 21 31 41 51 

MPRLFLFHLL EFCLLLNQFS RAVAAKWKDD VIKLCGRELV RAQIAICGMS TWSKRSLSQE 60 
DAPQTPRPVA EIVPSFINKD TETIIIMLEF IANLPPELKA ALSERQPSLP ELQQYVPALK 120 

DSNLSFEEFK KLIRNRQSEA ADSNPSELKY LGLDTHSQKK RRPYVALFEK CCLIGCTKRS 180 
LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005845 

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCAG 480 

TGTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCCTGTGGGC AGGACCACTG 660 

CAGGCGATCG CAGTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTGCTGGG 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840 

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GTATGGGGCT 1080 

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140 

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260 

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGG ATAATGAGGA AAGTGAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAACCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACGTTT 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

ATTCCTTGGA TCGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700 

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820 

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880 

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC CCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTGTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA 
SEQ ID NO:42 PBH5 PROTEIN SEQUENCE 

Protein Accession #: NP_005836 

1 11 21 31 41 51 

I I I I I I 

MLPVYQEVKP NPLQDANLCS RVFFWWLNPL FKIGHKRRLE EDDMYSVLPE DRSQHLGEEL 60 

QGFWDKEVLR AENDAQKPSL TRAIIKCYWK SYLVLGIFTL IEESAKVIQP IFLGKIINYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGMRLRVAM CHMIYRKALR 180 

LSNMAMGKTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAIAVTALLW MEIGISCLAG 240 

MAVLIILLPL QSCFGKLFSS LRSKTATFTD ARIRTMNEVI TGIRIIKMYA WEKSFSNLIT 300 

NLRKKEISKI LRSSCLRGMN LASFFSASKI IVFVTFTTYV LLGSVITASR VFVAVTLYGA 360 

VRLTVTLFFP SAIERVSEAI VSIRRIQTFL LLDEISQRNR QLPSDGKKMV HVQDFTAFWD 420 

KASETPTLQG LSFTVRPGEL LAVVGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480 

QPWVFSGTLR SNILFGKKYE KERYEKVIKA CALKKDLQLL EDGDLTVIGD RGTTLSGGQK 540 

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600 

SQILILKDGK MVQKGTYTEF LKSGIDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSESSVW 660 

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSEGKVGFQA YKNYFRAGAH WIVFIFLILL 720 

NTAAQVAYVL QDWWLSYWAN KQSMLNVTVN GGGNVTEKLD LNWYLGIYSG LTVATVLFGI 780 

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKDIG HLDDLLPLTF 840 

LDFIQTLLQV VGVVSVAVAV IPWIAIPLVP LGIIFIFLRR YFLETSRDVK RLESTTRSPV 900 

FSHLSSSLQG LWTIRAYKAE ERCQELFDAH QDLHSEAWFL FLTTSRWFAV RLDAICAMFV 960 

IIVAFGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVIEYTDLE 1020 

KEAPWEYQKR PPPAWPHEGV IIFDNVNFMY SPGGPLVLKH LTALIKSQEK VGIVGRTGAG 1080 

KSSLISALFR LSEPEGKIWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT MRKNLDPFNE 1140 

HTDEELWNAL QEVQLKETIE DLPGKMDTEL AESGSNFSVG QRQLVCLARA ILRKNQILII 1200 

DEATANVDPR TDELIQKKIR EKFAHCTVLT IAHRLNTIID SDKIMVLDSG RLKEYDEPYV 1260 

LLQNKESLFY KMVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTNT SNGQPSTLTI 1320 
FETAL 



SEQ ID NO:43 PBQ7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_021233 

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300 

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC ACCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAATATAAAA 900 

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT 

SEQ ID NO:44 PBQ7 Protein sequence 
Protein Accession #: NP_067056 

1 11 21 31 41 51 

I I I I I I 

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGETGLEYL 60 
YLDSTTRSWR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKPVNYSRKY 120 
GHTKGLLLWN RVQGFWLIHS IPQFPPIPEE GYDYPPTGRR NGQSGICITF KYNQYEAIDS 180 
QLLVCNPNVY SCSIPATFHQ ELIHMPQLCT RASSSEIPGR LLTTLQSAQG QKFLHFAKSD 240 
SFLDDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNIKAIKLS RHSYFSSYQD 300 
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAFQGLV LYYESCK 

SEQ ID NO:45 PCG9 DNA SEQUENCE 

Nucleic Acid Accession #: XM_030453 

Coding sequence: 89-1273 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120 

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTGACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTGAA AATATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360 
CATCACCTAC CAGAGCAAGC TGGCCAAGGA CGTGCTGGAC ACCATCCTAG GCATCCAACC 420 

CAAGGACACC TCTGGTGGAG GGGATGAGAC CCGGGAGGCG GTGGTGGCCC GGCTGGCTGA 480 

TGATATGCTG GAGAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAG AGAGGCTGCA 540 

GAAGATGGGG CCATTCCAGC CTATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 600 

AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC 660 

CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720 

TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 780 

TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 840 

GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900 

GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT 960 

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT GTCTATGGCT TATATCTTGA 1020 

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 1080 

TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCTCGGTT 1140 

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 1200 

GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTGCTC CGTGGGGTTG CCCTTCTGTG 1260 

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 1320 

AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 1380 

AATTAATGAG CTGCATAGGT TTTCCCCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 1440 

TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500 

ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC 1560 

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 1620 

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 1680 

AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG 1740 

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800 

CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA 1860 

TAGTCAGTAC TAAATTAGAA TTGTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA 1920 

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC 1980 

CCTCTCACTG GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040 

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100 

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160 

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220 
TTTACTAAAA AAAAAAAAAA AAA 

SEQ ID NO:46 PCQ8 Protein sequence 
Protein Accession #: BAB15543

1 11 21 31 41 51

I I I I I I

MDVKKGVSWT TIRYMIGEIQ YGGRVTDDYD KRLLNTFAKV WFSENMFGPD FSFYQGYNIP 60 

KCSTVDNYLQ YIQSLPAYDS PEVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE 120 

TREAVVARLA DDMLEKLPPD YVPFEVKERL QKMGPFQPMN IFLRQEIDRM QRVLSLVRST 180

LTELKLAIDG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS 240 

WVFNGRPHCF WMTGFFNPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW MKDDISTPPT 300 

EGVYVYGLYL EGAGWDKRNM KLIESKPKVL FELMPVIRIY AENNTLRDPR FYSCPIYKKP 360 
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK

SEQ ID NO:47 PDG5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60 

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480 

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600 

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720 

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780 

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900 

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTGTTCA GCAACAAGTC CCCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080 

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140 

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320 

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380 

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 

TAAGGAGCAG CTGCTTCCCA GACATCTTTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620 

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680
TCCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACCG ATTCCCAGGC GTCCGACCCA 1800 

GTCATTCGTG AAATTTATGG CACAGCAAAT CTTTTCAGAG AGCTCTGCTC TTAAGAGGGG 1860 

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920 

CAAGCACCAA GTTTTCTCAG ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980 

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040 

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCTGTCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAG CAAAAAGTCT CCCCTGTTTC 2160 

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAG CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA GATAGGTCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAGTGAATGT 2280 

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGCGATCT GTTTTTGAGA GCAATTCTGA 2400 

CAATTGGTTC CTAGGAAGAG ATGAAGCTTT TGCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTCCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580 

GAGTGGTGAT GGTAATAATA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCCC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGCCCA ATTTCATCCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCATAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AAGTTCCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240 

TCTCCCAGCC AAGTTCCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300 

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTGGAGCATC AGCATTTATT TTATTTAGTT TTTTTTTTTT ' n TTTT TO T GAGACAGAGT 3420 

CTCGCTCTGT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTCCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG CCCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660 

CTGGGATTAC AGGCGTGAGC CACCGCGCCC GGCCAAGCAT CAGCGTTTTA AATGATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780 

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT CCCTCAACCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGCCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CTTTCTCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACOOCACGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTTCTTGTC 4500 

CCAATACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTCC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TGTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTTACTAGGC TGGAGAGACC CTACCTTCCA GTGACCCACT 4920 

CATCCCCCAG CCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980 

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040 

CCATCTTGCC ACACGGTCTT TTTCTTTTGT AGCACAGCCT CCATTAATAA CTCCTCGGCT 5100 

GAGGATGAAG ATGTAGGCAC CTTTACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA CCACCCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280 

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580 

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640 

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700 

TGGTAGATTA GCTGTTACAC CTCCCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 PDG5 Protein sequence
Protein Accession #: BAA86524

1 11 21 31 41 51 

I I I I I I 

EQPTTSQPET TTPQGLLSDK DDMGRRNAGI DFGSRKASAA QPIPENMDNS MVSDPQPYHE 60
DAASGAEKTE ARASLSLMVE SLSTTQEEAI LSVAAEAQVF MNPSHIQLED QEAFSFDLQK 120

AQSKMESAQD VQTICKEKPS GNVHQTFTAS VLGMTSTTAK GDVYAKTLPP RSLFQSSRKP 180

DAEEVSSDSE NIPEEGDGSE ELAHGHSSQS LGKFEDEQEV FSESKSFVED LSSSEEELDL 240

RCLSQALEEP EDAEVFTESS SYVEKYNTSD DCSSSEEDLP LRHPAQALGK PKNQQEVSSA 300 

SNNTPEEQND FMQQLPSRCP SQPIMNPTVQ QQVPTSSVGT SIKQSDSVEP IPPRHPFQPW 360 

VNPKVEQEVS SSPKSMAVEE SISMKPLPPK LLCQPLMNPK VQQNMFSGSE DIAVERVISV 420 

EPLLPRYSPQ SLTDPQIRQI SESTAVEEGT YVEPLPPRCL SQPSERPKFL DSMSTSAEWS 480 

SPVAPTPSKY TSPPWVTPKF EELYQLSAHP ESTTVEEDIS KEQLLPRHLS QLTVGNKVQQ 540 

LSSNFERAAI EADISGSPLP PQYATQFLKR SKVQEMTSRL EKMAVEGTSN KSPIPRRPTQ 600 

SFVKFMAQQI FSESSALKRG SDVAPLPPNL PSKSLSKPEV KHQVFSDSGS ANPKGGISSK 660 

MLPMKHPLQS LGRPEDPQKV FSYSERAPGK CSSFKEQLSP RQLSQALRKP EYEQKVSPVS 720 

ASSPKEWRNS KKQLPPKHSS QASDRSKFQP QMSSKGPVNV PVKQSSGEKH LPSSSPFQQQ 780 

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AIKTKKFSQG SKNPIKSIPA PATKPGKFTI 840 

APVRQTSTSG GIYSKKEDLE SGDGNNNQHA NLSNQDDVEK LFGVRLKRAP PSQKYKSEKQ 900 

DNFTQLASVP SGPISSSVGR GHKIRSTSQG LLDAAGNLTK ISYVADKQQS RPKSESMAKK 960

QPACKTPGKP AGQQSDYAVS EPVWITMAKQ KQKSFKAHIS VKELKTKSNA GADAETKEPK 1020 

YEGAGSANEN QPKKMFTSSV HKQEKTAQMK PPKPTKSVGF EAQKILQVPA MEKETKRSST 1080
LPAKFQNPVE PIEPVWFSLA RKKAKAWSHM AEITQ

SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GCTTTCCTTT CTAAAGTAGA AGAGGATGAT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG GTCTAAAGAA AAAAACCCTG GGAATCAGGG CAGGCAGTTT 120 

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180 

ATTGAAGAAA GCAAGCAAGA AACTAGTATG ATTTTGGATA GTGAAAAAAC AAGTGAGACT 240 

GCTGCCAAAG GGGTCAACAC AGGAGGCAGG GAACCAAATA CAATGGTGGA AAAAGAACGC 300 

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360 

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420 

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480 

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG CCAGGGGTCT 540 

GCTGCTGCAG AACCTGAAGA TGACTCGTTC CACTCGACTC CACATACAAG TGTAGAGCCA 600 

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660 

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720 

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGCCTGC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TAGAGATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900 

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960 

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCACCTCTAG AGGAAGGCTT GGGTGGAGCA 1020 

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080 

AATGTGCAGG TTCCTGAAGA ACCCACCCAC TTGGACCAAC GTGTGATTGG GGACACTCAT 1140 

GCCTCAGAAG TGTCACAGAA GCCAAATACT GAGAAAGACC TGGACCCAGG GCCAGTTACA 1200 

ACAGAAGACA CTCCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGCCGCCGAA 1260 

GAGCCGGCAA GTGTCACACC TTTGGAAAAC GCAATCCTTC TAATATATTC ATTCATGTTT 1320 

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380 

TATGGACTGC CATGGAAACC TGTATTTATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440 

ATTTTCTTAT GGAGAACTGT CCTTGTTGTG AAGGATAGAG TATATCAAGT CACGGAACAG 1500 

CAAATTTCTG AGAAGTTGAA GACTATCATG AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AATCTTCGTG TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800 

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920 

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAG CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAGTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100 

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG AAGTGGGAGG TGACCGGAAT GAGAAGATGA AAAATCAAAT TAAGCAGATG 2220 

ATGGATGTCT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCCTC CGTGTCCACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340 

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400 

ACCTTGAGGC AGAAAGTGGA GATTCTGAAT GAGCTCTATC AGCAGAAGGA GATGGCTTTG 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA CGGCAAGAAA GAGAGCACAG GCTGTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGCG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA GAAGACAGAG CGGTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTG CAGAAAGAGC TATAGCTGAA 2700 

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760 

ATGCTGCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820 

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TCTGAGTGGT 2880 

GGAGAATGCT CCCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGCCTAG AAGTGAATTT GGATCAGTGG ACGGGCCTCT ACCTCATCCT 3000 

CGATGGTCAG CTGAGGCATC TGGGAAACCC TCTCCTTCTG ATCCAGGATC TGGTACAGCT 3060 
ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120

GTTAATATGG CTCCAAAAGG GCCCCCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCCCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240 

TTTGGGCCTC GGCCACTTCC TCCACCCTTT GGCCCTGGTA TGCGTCCACC ACTAGGCTTA 3300
AGAGAATTTG CACCAGGCGT TCCACCAGGA AGACGGGACC TGCCTCTCCA CCCTCGGGGA 3360

TTTTTACCTG GACACGCACC ATTTAGACCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420 

ATTCCTGGTA CCCGATTACC ACCCCCAACC CATGGTCCCC AGGAATACCC ACCACCACCT 3480 

GCTGTAAGAG ACTTACTGCC GTCAGGCTCT AGAGATGAGC CTCCACCTGC CTCTCAGAGC 3540 

ACTAGCCAGG ACTGTTCACA GGCTTTAAAA CAGAGCCCAT AAAACTATGA CCTCTGAGGT 3600

TTCATTGGAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660 

AAAATCCAAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720 

CATTTTTGAG CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780 

AGCTAGAGCG TCCTTACAAC TTTGAAATGT GCAATAAAGA ATACCTGTGT TTTAGCTAAT 3840 

GTAGCATATG TAATTGCAAA ATGATTTAGA ATGTCATGAA AAATATGAAC ATTTCCTGTG 3900 

GAAATGCTTT AAGAACATGT ATTTCCATTA TCCTATTTTT AGTGTACACC AGCTGAATAC 3960 

GGAGCAATGG TGTTTATAAG CGTTTTTTTA AACTATCTGG TCACAAAGAC TGTTACGCTA 4020 

AAAATGTTTA CTAAAAGATC ACTAAACTAT CTCCCCTCTT GCTGAAGTTC TTTGTAGTAA 4080 

TAGCTCATAA AAATTTGTTT ATTAATATTT CCCAAGTGTC TGTTGACTCA TTGGACTGTT 4140 

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCAGGCT CCCAGAACTG AAGATGGTGG 4200 

CTGGTGGCAC ACTTCCGGCT GCTCCTCCGT CACCTGTGAA CTCTACAAGT GATGTCTTTT 4260 

TATTTCAAAG AAGTTTATTT CCCACTTGTA TAGCATTCAC ATGCTTTCTT TACGATCCTC 4320 

ATTGTCTATT TGAGAATGGT TTTCTGAGAG TGAGTTTACA TTAGTAGCAA GAGTTGTTTG 4380 

ACCTGATGTT CCATTGTTTT TACCATTCCT GTAGAAAAAG GGTGCACAAC AGAAAAATGA 4440 

AAATGATGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500 

CCTTATCTAT CTTTCCCATT CCTTGCCACT GATTTTTGAG GAATATAATA AAAAGATTGG 4560 

AAGAGTATAA TGCCATGAGA AAGAATGATT TAGGACTGTG AGGGTTATAA CATGCCCTAG 4620 

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680 

ATTATTOCAA AATTAATATT AATTAATATT TAAACGTTGG TGTTTTTATT TAAAAATCAG 4740 

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800 

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAGAA CGAACTTCAG ACATTTTAAT 4860 

TACAGTAATA ATAGCACTCC TTTTAAGGAG TTTCAGATCC ACACTAAAAC TAAAATCATA 4920 

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980 

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040 

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100 

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGACCTTTG AAAGAAGACA ACTCTGTCAA 5160 

AGTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA TGTCCTCCCG 5220 

TTTAATATCA AGAATAGAAG AAATTAAGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280 

TTAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340 

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATGTCATTTA 5400 

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460 

ATTCAAAATA TTAGAGTATT TTTCCCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520 

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580 

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640 

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700 

TTCCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAATTGT GTCTGGATTT 5760 

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820 

GCCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880 

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940 
TTATATTCAG GTCTGAATTA AAGTTAAGTT AATCAC 



SEQ ID NO:50 PAB7 Protein sequence
Protein Accession #: BAA13446

1 11 21 31 41 51 

I I I I I I 

AFLSKVEEDD YPSEELLEDE NAINAKRSKE KNPGNQGRQF DVNLQVPDRA VLGTIHPDPE 60

IEESKQETSM ILDSEKTSET AAKGVNTGGR EPNTMVEKER PLADKKAQRP FERSDFSDSI 120 

KIQTPELGEV FQNKDSDYLK NDNPEEHLKT SGLAGEPEGE LSKEDHGNTE KYMGTESQGS 180

AAAEPEDDSF HWTPHTSVEP GHSDKREDLL IISSFFKEQQ SLQRFQKYFN VHELEALLQE 240 

MSSKLKSAQQ ESLPYNMEKV LDKVFRASES QILSIAEKML DTRVAENRDL GMNENNIFEE 300 

AAVLDDIQDL IYFVRYKHST AEETATLVMA PPLEEGLGGA MEEMQPLHED NFSREKTAEL 360

NVQVPEEPTH LDQRVIGDTH ASEVSQKPNT EKDLDPGPVT TEDTPMDAID ANKQPETAAE 420

EPASVTPLEN AILLIYSFMF YLTKSLVATL PDDVQPGPDF YGLPWKPVFI TAFLGIASFA 480

IFLWRTVLVV KDRVYQVTEQ QISEKLKTIM KENTELVQKL SNYEQKIKES KKHVQETRKQ 540

NMILSDEAIK YKDKIKTLEK NQEILDDTAK NLRVMLESER EQNVKNQDLI SENKKSIEKL 600 

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHRVQEENA RLKKKKEQLQ QEIEDWSKLH 660

AELSEQIKSF EKSQKDLEVA LTHKDDNINA LTNCITQLNL LECESESEGQ NKGGNDSDEL 720

ANGEVGGDRN EKMKNQIKQM MDVSRTQTAI SVVEEDLKLL QLKLRASVST KCNLEDQVKK 780

LEDDRNSLQA AKAGLEDECK TLRQKVEILN ELYQQKEMAL QKKLSQEEYE RQEREHRLSA 840

ADEKAVSAAE EVKTYKRRIE EMEDELQKTE RSFKNQIATH EKKAHENWLK ARAAERAIAE 900 

EKREAANLRH KLLELTQKMA MLQEEPVIVK PMPGKPNTQN PPRRGPLSQN GSFGPSPVSG 960

GECSPPLTVE PPVRPLSATL NRRDMPRSEF GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020 

TMMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080

FGPRPLPPPF GPGMRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP LGSLGPREYF 1140 
IPGTRLPPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP 

SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AGACTGAGGC GGAGGCAGCC CCGCGCCGCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 60
TTGGACTTTG AGCCATTAGA ACCATGAGCA ACTACAGTGT GTCACTGGTT GGCCCAGCTC 120
CTTGGGGTTT CCGGCTGCAG GGCGGTAAGG ATTTCAACAT GCCTCTGACA ATCTCTAGTC 180
TAAAAGATGG CGGCAAGGCA GCCCAGGCAA ATGTAAGAAT AGGCGATGTG GTTCTCAGCA 240
TTGATGGAAT AAATGCACAA GGAATGACTC ATCTTGAAGC CCAGAATAAG ATTAAGGGTT 300
GTACAGGCTC TTTGAATATG ACTCTGCAAA GAGCATCTGC TGCACCCAAG CCTGAGCCGG 360
TTCCTGTTCA AAAGGGAGAA CCTAAAGAAG TAGTTAAACC TGTGCCCATT ACATCTCCTG 420
CTGTGTCCAA AGTCACTTCC ACAAACAACA TGGCCTACAA TAAGGCACCA CGGCCTTTTG 480
GTTCTGTGTC TTCACCAAAA GTCACATCCA TCCCATCACC ATCGTCTGCC TTCACCCCAG 540
CCCATGCGAC CACCTCATCA CATGCTTCCC CTTCACCCGT GGCTGCCGTC ACTCCTCCCC 600
TGTTCGCTGC ATCTGGACTG CATGCTAATG CCAATCTTAG TGCTGACCAG TCTCCATCTG 660
CACTGAGCGC TGGTAAAACT GCAGTTAATG TCCCACGGCA GCCCACAGTC ACCAGCGTGT 720
GTTCCGAGAC TTCTCAGGAG CTAGCAGAGG GACAGAGAAG AGGATCCCAG GGTGACAGTA 780
AACAGCAAAA TGGCCCACCA AGAAAACACA TTGTGGAGCG CTATACAGAG TTTTATCATG 840
TACCCACTCA CAGTGATGCC AGCAAGAAGA GACTGATTGA GGATACTGAA GACTGGCGTC 900
CAAGAACTGG AACAACTCAG TCTCGCTCTT TCCGAATCCT TGCCCAGATC ACTGGGACTG 960
AACATTTGAA AGAATCTGAA GCCGATAATA CAAAGAAGGC AAATAACTCT CAGGAGCCTT 1020
CTCCGCAGTT GGCTTCCTTG GTAGCTTCCA CACGGAGCAT GCCCGAGAGC CTGGACAGCC 1080
CAACCTCTGG CAGACCAGGG GTTACCAGCC TCACAACTGC AGCTGCCTTC AAGCCTGTAG 1140
GATCCACTGG CGTCATCAAG TCACCAAGCT GGCAACGGCC AAACCAAGGA GTACCTTCCA 1200
CTGGAAGAAT CTCAAACAGC GCTACTTACT CAGGATCAGT GGCACCAGCC AACTCAGCTT 1260
TGGGACAAAC CCAGCCAAGT GACCAGGACA CTTTAGTGCA AAGAGCTGAG CACATTCCAG 1320
CAGGGAAACG AACTCCGATG TGCGCCCATT GTAACCAGGT CATCAGAGGA CCATTCTTAG 1380
TGGCACTGGG GAAATCTTGG CACCCAGAAG AATTCAACTG CGCTCACTGC AAAAATACAA 1440
TGGCCTACAT TGGATTTGTA GAGGAGAAAG GAGCCCTGTA TTGTGAGCTG TGCTATGAGA 1500
AATTCTTTGC CCCTGAATGT GGTCGATGCC AAAGGAAGAT CCTTGGAGAA GTCATCAATG 1560
CGTTGAAACA AACTTGGCAT GTTTCCTGTT TTGTGTGTGT AGCCTGTGGA AAGCCCATTC 1620
GGAACAATGT TTTTCACTTG GAGGATGGTG AACCCTACTG TGAGACTGAT TATTATGCCC 1680
TCTTTGGTAC TATATGCCAT GGATGTGAAT TTCCCATAGA AGCTGGTGAC ATGTTCCTGG 1740
AAGCTCTGGG CTACACCTGG CATGACACTT GCTTTGTATG CTCAGTGTGT TGTGAAAGTT 1800
TGGAAGGTCA GACCTTTTTC TCCAAGAAGG ACAAGCCCCT GTGTAAGAAA CATGCTCATT 1860
CTGTGAATTT TTGAAAGTCA ACAGTTCAGG AGAAGAGAAG GAATTTGAAG AGAAAAAGGA 1920
AAATTAAAAT TACTAATTAA TTTTTAGATT CAATATTTAT ATGGAGTTTT GAAAAATAAT 1980
AGTGGCCCTG AAGGAATAAA TTCCAGCTTT AAAAACCAAG TCTGAGGAAA TATTTGGCTT 2040
CATAAAGTAA AGAGACGGTT TGGCATTTAT TATTACTTTT TCCTGTATTT TATGCCCATA 2100
AAATAAGCTT TATAAAAACC AATTTCCTGA TGGACTATTA AATTCATCTT AGAATAAATT 2160
AGTGAAGAAT TTAATTTTAG AATAAATAAT CCAATCTGAA ATAATTATAC CTTCTTTCCT 2220
TGTTAGGTAG TTATGAGTAA ATCTGCAAAA GGCAATGAAA ATGCCTTAAA TTTTATCAAT 2280
AACAGAATTA TTGTATTTAA AAAAAAACTA ATACTTATCT TTAAAATAGT AAATAGGATT 2340
TTAAACAGAG AATTTTATCA GTAATAGGTG TCAGTTTTTA AAAAATTGCT TGTAGGCTGA 2400
GCGCGGTGGC TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGTGGG TGGACCACAT 2460
GAGGTCAGGA GTTTGAGATC AGCCTGGCCA ACATGGTGAA ACCCCATCTC TACTAAAAAT 2520
ACAAAAATTA GCCGGACGCA GTGGCACGCG CCTGTAATCC CAGCTACTCA AGAGGCTGAG 2580
GCACGAGAAT CACTTGAACC CGGGAGGGAG AGGTTGCAGT GAGCCAAGAT CGTACCACTG 2640
CACTCCAGCC TGGGTGACAG AGTGAGACTC CGTCTCCAAA AAAAAACTTT GCTTGTATAT 2700
TATTTTTGCC TTACAGTGGA TCATTCTAGT AGGAAAGGAC AATAAGATTT TTTATCAAAA 2760
TGTGTCATGC CAGTAAGAGA TGTTATATTC TTTTCTTATT TCTTCCCCAC CCAAAAATAA 2820
GCTACCATAT AGCTTATAAG TCTCAAATTT TTGCCTTTTA CTAAAATGTG ATTGTTTCTG 2880
TTCATTGTGT ATGCTTCATC ACCTATATTA GGCAAATTCC ATTTTTTCCC TTGCGCTAAG 2940
GTAAAGATTT AATTAAATAA TTTTGGCCTC TCATAGTTTT CTCTCTCTTT AAAGAGAATA 3000
AATAGAGGGC CAGGTGTGGT GGCTCACGCC TGTGATCCCA GCACTTTGGG AGGCCAAGAC 3060
GGGCGGATCA TGAGGTCAAG AGATCAAGAT CATCCTGGCC AACATGGTGA AACCCTGTCT 3120
CTACTAAAAA TACAAAAATG AGCTGGGCAT GGTGGGGCGT GCCTGTAGTC CCATGTACTT 3180
GGGAGGCTGA GGCAGGAAAA TTCTTGAACC CAGGAGACGG AAGTTGCAGT GAGCTGAGAT 3240
CACACCACTG CACTCCAGCC TGGTGACAGA GCAAGACTCC GGCTCTT



SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448



1 11 21 31 41 51

I I I I I I

MSNYSVSLVG PAPWGFRLQG GKDFNMPLTI SSLKDGGKAA QANVRIGDVV LSIDGINAQG 60
MTHLEAQNKI KGCTGSLNMT LQRASAAPKP EPVPVQKGEP KEVVKPVPIT SPAVSKVTST 120
NNMAYNKAPR PFGSVSSPKV TSIPSPSSAF TPAHATTSSH ASPSPVAAVT PPLFAASGLH 180
ANANLSADQS PSALSAGKTA VNVPRQPTVT SVCSETSQEL AEGQRRGSQG DSKQQNGPPR 240
KHIVERYTEF YHVPTHSDAS KKRLIEDTED WRPRTGTTQS RSFRILAQIT GTEHLKESEA 300
DNTKKANNSQ EPSPQLASLV ASTRSMPESL DSPTSGRPGV TSLTTAAAFK PVGSTGVIKS 360
PSWQRPNQGV PSTGRISNSA TYSGSVAPAN SALGQTQPSD QDTLVQRAEH IPAGKRTPMC 420
AHCNQVIRGP FLVALGKSWH PEEFNCAHCK NTMAYIGFVE EKGALYCELC YEKFFAPECG 480
RCQRKILGEV INALKQTWHV SCFVCVACGK PIRNNVFHLE DGEPYCETDY YALFGTICHG 540
CEFPIEAGDM FLEALGYTWH DTCFVCSVCC ESLEGQTFFS KKDKPLCKKH AHSVNF


SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 60
GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120
CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGGGAATG 180
AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCGCTGACCC TGGCAGCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAT GTGGGGACTT CTACAACACT 480 

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660 

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

AAGGAACTGC AGCAGCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780 

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTGC ACACCTGAGG 900 

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATATC AACAA 

SEQ ID NO:54 PBH7 Protein sequence
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

I I I I I I 

MANCKMTKSI RFPALEHCYT GGEVVLPKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60

KIKPGFMGKA TPPYDVQFHM EASVENCIIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120

NTEGNIGIRI KPVRPVSLFM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180

IINASGYRIG PAEVESALVE HPAVAESAVV GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIERKELR KKETGQM 

SEQ ID NO:55 PBJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF388200

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A 

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK83352

1 11 21 31 41 51 

I I I I I I 

MCCEIYYRLL VLKMEKKSEE LRNMDGLGNV EKGH 
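
The relationship between a "Coding sequence" range and the paired protein entry can be checked mechanically. The following is a minimal illustrative sketch, not part of the original listing: it assumes the standard genetic code and 1-based inclusive coordinates, and the names translate_cds and SEQ_ID_NO_55 are hypothetical helpers introduced only for this example. It reads the first 180 nucleotides of the PBJ5 DNA entry above, takes positions 33-137, and translates them.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed table).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First three rows of the PBJ5 DNA entry, copied with their position numbers.
SEQ_ID_NO_55 = """
GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60
TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120
TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180
"""

def translate_cds(listing, start, stop):
    """Translate positions start..stop (1-based, inclusive) of a listing."""
    # Keep only nucleotide letters; spaces and position numbers are dropped.
    seq = "".join(ch for ch in listing.upper() if ch in "ACGT")
    cds = seq[start - 1:stop]
    if len(cds) % 3 != 0:
        raise ValueError("coding span is not a whole number of codons")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    if not protein.endswith("*"):
        raise ValueError("coding span does not end in a stop codon")
    return protein[:-1]  # drop the stop symbol

print(translate_cds(SEQ_ID_NO_55, 33, 137))
```

Under these assumptions the printed string can be compared group by group with the PBJ5 protein entry above. The same check applies to any DNA and protein pair in the listing whose coding range is given.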

SEQ ID NO:57 PBJ7 DNA SEQUENCE

Nucleic Acid Accession #: AA876910

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480 

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTTATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGGTT GGCCTGCACC TCAGGTCTCA CTCGCTGCAT TAATGGAACT 1380
GAACCAGGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

GTCCCACTTC TGGTTCCCCT ATTGGCTGGT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGA1GC? 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGGTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTGCAGCTC TAGGAGAAAG TTGTTGCTTC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAGTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGGTTA 1920 

GCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

I I I I I I 

MDSCLQHMRD LLYLLQELRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60 

ADATVFTDGS SFLEQGERKA VSFPQPDLPD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120 

WRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180 

AEKGLQNVDF YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISRVPHPK 240 

LCTRKNCNPL TITVHDPNAA QWYYGMSWGL RLYIPGFDVG TMFTIQKKIL VSWSSPKPIG 300 

PLTDLGDPIF QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS LMSILGGVHH LLNLTQPKLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420 

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVYVYS 480 

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSAXD ILHSQVESLA EVVLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVXKG 600 

TVKKVRENLD RHQQERENNI PWYQSMFNWN PWLTTLITGL AGPLLILLLS LIFGPCILNS 660 
FLNFIKQRIA SVKLTYLKTQ YDTLVKN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGACC AGAAACTTCT 780 

CCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCCAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATGGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT CCCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG GATGAGAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920 

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT OCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTCCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580 
GGAACCAAAT CAGATGAAAA AGTGGACTTG 
AACTGGTTTA CATGGTGTCA TAATTGCAGG 
TGGTTCAGGG ACCATGCAGA GTGCCCTGTG 
GATACAACGG GGAATCTGGT ACCTGCAGAG 
AGAGAACCCT TCAAGTGTGG AGCTTTCTAG 
TCAGAACAAG CCATTCATGA CTTACCTGTA 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51 

I I I I I I 

MSGTKPDILW APHHVDRFVV CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60 
PYMKCVAWYL NYDPECLLAV GQANGRVVLT SLGQDHNSKF KDLIGKEFVP KHARQCNTLA 120 
WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180 
LGQNDACLSL CWLPRDQKLL LAGMHRNLAI FDLRNTSQKM FVNTKAVQGV TVDPYFHDRV 240 
ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLLATLTRDS NIIRLYDMQH 300 
TPTPIGDETE PTIIERSVQP CDNYIASFAW HPTSQNRMIV VTPNRTMSDF TVFERISLAW 360 
SPITSLMWAC GRHLYECTEE ENDNSLEKDI ATKMRLRALS RYGLDTEQVW RNHILAGNED 420 
PQLKSLWYTL HFMKQYTEDM DQKSPGNKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480 
SDIQNLNEER ILALQLCGWI KKGTDVDVGP FLNSLVQEGE WERAAAVALF NLDIRRAIQI 540 
LNEGASSEKG RRSESQCGSN GFIGLYG 

SEQ ID NO:61 PDG3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

TTGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GAGTCCTGGC TTTGTAAAAT GACTTATAAA GGTCCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240 

AGTGCTAAAT CTTGTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300 

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480 

AAGTTGTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600 

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCXXAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780 

AATAATGTAA GCCTTAATAT TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CTTACTTGAA AACTTT 

SEQ ID NO:62 PDG3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51 

I I I I 1 I 

MGARGAPSRR RQAGRRLRYL PTGSFPPXLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 
SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 
AFCMKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 
WXADRTDVHI RVFRPPKYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240 
AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE 

SEQ ID NO:63 PDG8 DNA SEQUENCE

Nucleic Acid Accession #: AL080235

Coding sequence: 245453 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGTCGCCGCA CCGGCCGCCT CCGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60 

GGGGCGCCCA CCGCGCTGCC AGCCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGGCTG 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300 

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360 

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420 

GCGGCCGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCXX: TGTGTCCGTC 480 

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCCGGTGTG CTTCGTGCTG 540 

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600 

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660 

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TCTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTGC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900 

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960 



AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640 

CACGGTGGAC ATGCTGGACA TATGCTTAGT 2700 

TCTGCATGCA CGTGTAAATG TATGCAGTTG 2760 

ACTGTCCAGC CATAAAATGT TACCACCTTA 2820 

TAGGTGTCCT TCATAGCTCA GAAACATACC 2880 

ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 



CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020 
TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG8 Protein sequence
Protein Accession #: CAB45781

1 11 21 31 41 51

| | | | | |

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHFCCL DFSLEELQGE 60 

PGWRLNRKPI ESTLVACFMT LVIVVWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120 
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK 

SEQ ID NO:65 PDM1 DNA SEQUENCE

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CGGOCGCGGC CCGGGTCCCT CGCAAAGCCG CTGCCATCCC GGAGGGCCCA GCCAGCGGGC 60 

TCCCGGAGGC TGGCCGGGCA GGCGTGGTGC GCGGTAGGAG CTGGGCGCGC ACGGCTACCG 120 

CGCGTGGAGG AGACACTGCC CTGCOGCGAT GGGGGCCCGG GGCGCTCCTT CACGCCGTAG 180 

GCAAGCGGGG CGGCGGCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240 

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAGAATC TTTTAGCTGA 300 

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCGAA TGAATGGTGA 360 

TAAATTCCGA AAATTTATAA AGGCACCACC TCGAAACTAT TCCATGATTG TTATGTTCAC 420 

TGCTCTTCAG CCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480 

ACTGGCGAAC TCCTGGCGCT ATTCATCTGC TTTTTGTAAC AAGCTCTTCT TCAGTATGGT 540 

GGACTATGAT GAGGGGACAG ACGTTTTTCA GCAGCTCAAC ATGAACTCTG CTCCTACATT 600 

CAYGCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660 

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG GATTGCTGAC AGAACGGATG TTCATATTCG 720 

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780 

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840 

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900 

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960 

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTGA ATGCCGCTAT 1020 

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTGGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140 

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260 

CAAGTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320 

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380 

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440 

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500 
CAATAAATGA CAATGTAATT A 



SEQ ID NO:66 PDM1 Protein sequence
Protein Accession #:

1 11 21 31 41 51

| | | | | |

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240 

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE 






SEQ ID NO:67 PDM2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000947 

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60 

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120 

AGGTTGGCAG GTGACCAGAG GAATGCTTCC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360 

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420 

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTCCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660 

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720 




WO 02/30268 



PCT/US01/32045 






AAGGACATTG TGGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780 
TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTGATGAAA GACTTCAGCC TCTGCTCAAT 840 
CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 
TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTGCAT GCGTCAGTTA 960 
CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATGGAG GCCGAATGCA GTATGGCCTA 1020 
TTTCTGAAGG GCATTGGTTT AACTTTGGAA CAGGCATTGC AGTTCTGGAA GCAAGAATTT 1080 
ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCGTCAC 1140 
AGCTTTGGAA AGGAAGGCAA GAGGACAGAC TATACACCTT TCAGTTGCCT GAAGATTATT 1200 
CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 
GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320 
TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTGTC AAAAATACTT TGAGATGATA 1380 
CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 
CAACGTATTC TAAATGGTGG TAAAGACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 
CAACCCAAAC CAAGTGTCCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 
TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620 
GTTTTATAAC CCTTTTTCCT CAATAGCCTG TTTCCTGTTT TTAAGATTTT GCCTTTGTTG 1680 
TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 
AGCCTTGACC TTCCCAGCTC AAGTGATCCT CCTACCTCAG CCTCCCAAGT AGTTAGGACA 1800 
CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGG 1860 
GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920 
AGCGTCCCAG AGTGCTGGGA TTACAGTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1980 
TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040 
TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 
AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160 
TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220 
AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 



SEQ ID NO:68 PDM2 Protein sequence: 
Protein Accession #: NP_000938 

1 11 21 31 41 51 

| | | | | | 

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60 
VSYVKGTEQY QSKLESELRK LKFSYREKLE DEYEPRRRDH ISHFILRLAY CQSEELRRWF 120 
IQQEMDLLRF RFSILPKDKI QDFLKDSQLQ FEAISDEEKT LREQEIVASS PSLSGLKLGF 180 
ESIYKIPFAD ALDLFRGRKV YLEDGFAYVP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240 
QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQIDLLS TKSFPPCMRQ LHKALRENHH 300 
LRHGGRMQYG LFLKGIGLTL EQALQFWKQE FIKGKMDPDK FDKGYSYNIR HSFGKEGKRT 360 
DYTPFSCLKI ILSNPPSQGD YHGCPFRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTHY 420 
QVACQKYFEM IHNVDDCGFS LNHPNQFFCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480 
KDASSALASL NSSLEMDMEG LEDYFSEDS 

SEQ ID NO:69 PDM3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024840 

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60 
GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCACATG GATGCAGCCT 120 
GTGTGGGAAG GCCTTCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 
AGAGAAGCCC TATGAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240 
TGCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300 
CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 
TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 
TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATGT GGGAAAGGCT TCAGCCAGAA 480 
GACATGTTTA ATATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540 
GTGTGGAAAA TCCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600 
AGAGAAACCC TATACATGCA GTGACTGTGG GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 
CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 
TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATG 780 
TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 
TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900 
AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTGAG 960 
CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 
ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 
AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140 
TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATGG 1200 
GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 
GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATCAGGG 1320 
GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 
CTAGTGGTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440 
GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500 
AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560 
AGGATGTGTA TTTTAGGACA ATATACCTTG AATCACTAGT TGATATGTCA ATGACTAATT 1620 
AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 
CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AGCTCTTGTG 1740 
TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800 






AAAATGTATT TAATTTAATA ATGTAACACA ACAAGTTTGG ATGTGTTTAA CTTTATAAAT 1860 
AATCACCCCA GAGGAATGAA GTTCAAAACT TGTGAATAAC C 



SEQ ID NO:70 PDM3 Protein sequence: 
Protein Accession #: NP_079116 

1 11 21 31 41 51 

| | | | | | 

MDAACVGRPS PKGPGSLNTR ELIQERSPMN ALNVTKHSAG NHSSMHIRKL TQERSHIYAV 60 

IVEKASFRRE ISLYISEFIL EKNPIYAMNV EKASSKRATS LFIDVLTLER NPMNAMNVGK 120 
ASAKRHV 

SEQ ID NO:71 PDM5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018455 

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

AATTTCGGCA CGGGGGGGAG GCACAGTGAG TCCACTGGGG CACGGCAGCG TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACTGGTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCTGATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360 

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGGGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATGAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660 

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA GTCAGCTTCA GAGAAACTGA 720 

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAGC CAAACCAGTA 780 

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840 

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900 

CCGAGAAGAG GAGATCATTT TAGATATTAC CGAAATGAAG AAAGCTTGCA ATTAGTGAAC 960 
ATGAAAGGAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA 



SEQ ID NO:72 PDM5 Protein sequence: 
Protein Accession #: NP_060925 

1 11 21 31 41 51 

| | | | | | 

MDETVAEFIK RTILKIPMNE LTTILKAWDF LSENQLQTVN FRQRKESVVQ HLIHLCEEKR 60 

ASISDAALLD IIYMQFHQHQ KVWDVFQMSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120 

VSFRETEEHA VWIRIAWGTQ YTKPKQYKPT YVVYYSQTPY AFTSSSMLRR NTPLLGQELE 180 

ATGKIYLRQE EIILDITEMK KACN 

SEQ ID NO:73 PDM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016192 
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60 

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120 

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360 

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420 

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGACGAAG ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660 

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720 

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020 

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA 






SEQ ID NO:74 PDM9 Protein sequence 
Protein Accession #: NP_057276 

1 11 21 31 41 51 

| | | | | | 

1 MVLWESPRQC SSWTLCEGFC WLLLLPVMLL TVARPVKLAA FPTSLSDCQT PTGWNCSGYD 
61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV PVCGSNGESY QNECYLRQAA 
121 CKQQSEILVV SEGSCATDAG SGSGDGVHEG SGETSQKETS TCDICQFGAE CDEDAEDVWC 
181 VCNIDCSQTN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQDN TTTTTKSEDG 
241 HYARTDYAEN ANKLEESARE HHIPCPEHYN GFCMHGKCEH SINMQEPSCR CDAGYTGQHC 
301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVVVLC ITRKCPRSNR IHRQKQNTGH 
361 YSSDNTTRAS TRLI 



SEQ ID NO:75 PD01 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014324 

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GGCGCCGGGA TTGGGAGGGC TTCTTGCAGG CTGCTGGGCT GGGGCTAAGG GCTGCTCAGT 60 
TTCCTTCAGC GGGGCACTGG CAAGCGCCAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120 
GTCCGGCCTG GCCCCGGGCC GTNTCTGTGC TATGGTCCTG GCTGACTTCG GGGCGCGTGT 180 
GGTACGCGTG GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC GGGGCAAGCG 240 
CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300 
GGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAGCTGGGCC 360 
CAGAGATTCT GCAGCGGGAA AATCCAAGGC TTATTTATGC CAGGCTGAGT GGATTTGGCC 420 
AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480 
TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCGCTGA ATCTCGTGGC 540 
TGACTTTGCT GGTGGTGGCC TTATGTGTGC ACTGGGCATT ATAATGGCTC TTTTTGACCG 600 
CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATATTT 660 
AAGTTCTTTT CTGTGGAAAA CTCAGAAATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720 
CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780 
GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840 
GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900 
TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG ACGGCACAGA 960 
TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAGGA 1020 
ACGGGGCTCG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTG CACCTCTGCT 1080 
GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATCCT TTCATAGGAG AACACACTGA 1140 
GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCAGATAA 1200 
AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260 
AATTTGAATA CTGCATTTAC AGTGTAGAGT AACACATAAC ATTGTATGCA TGGAAACATG 1320 
GAGGAACAGT ATTACAGTGT CCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380 
CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440 
TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG ATATATTTGT 1500 
TGATATTAAG ATTCTTGACT TATATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560 
TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620 
AAATGCCACA AATTGTATGG TGATAAAAGT CACGTGAAAC AGAGTGATTG GTTGCATCCA 1680 
GGCCTTTTGT CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACTT TAGCAACAGT 1740 
TATCACACTT TGTAATTTGC AAAGAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800 
CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860 
GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTGTTT CCCCGTGGGT 1920 
[illegible] TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980 
TTCTGGATCT TATACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:76 PD01 Protein sequence 
Protein Accession #: NP_055139 



1 11 21 31 41 51 

| | | | | | 

1 MALQGXSVVE LSGLAPGRXC AMVLADFGAR VVRVDRPGSR YDVSRLGRGK RSLVLDLKQP 
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG PRDSAAGKSK AYLCQAEWIW PVQESFCRLA 
121 GHDINYLALS GVLSKIGRSG ENPYAPLNLV ADFAGGGLMC ALGIIMALFD RTRTDKGQVT 
181 DANMVEGTAY LSSFLWKTQK SSLWEAPRGQ NMLDGGAPFY TTYRTADGEF MAVGAIEPQF 
241 YELLIKGLGL KSDELPNQMS TDDWPEKKKK FADVFAKKTK AEWCQIFDGT DACVTPVLTF 
301 EEVVHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEEFGFS 
361 REEIYQLNSD KIIESNKVKA SL 

SEQ ID NO:77 PD03 DNA SEQUENCE 

Nucleic Acid Accession #: AB028951 

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60 

CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120 

AGAGTCAAAA TAGCTGACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180 

GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240 

GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300 

TTGACTTCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360 






CATCATGATC AACTGGATCG GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420 
GAAGATATTA GAAAGATGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACG 480 
TATGCCAACA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540 
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAG 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GACCCTTTGC CAACATTAGA TGTATTTGCC 660 
GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720 
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780 
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
ACCGCAGGTG GGGCTGGGGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 
GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960 
AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 
CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260 
TACTGAGCAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTG ACTTCTCTGA TAAAGCGTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 
TTTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500 
AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560 
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AACCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 
GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860 
ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCTCTAAAG AACCATTGGT 2100 
TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160 
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTGC ACTGAATGAT TTCTTTTGCC CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400 
AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460 
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCAOCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 
TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760 
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGOCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGCC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000 
TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060 
CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAACCC 3240 
ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300 
GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360 
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCCC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 
CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660 
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 
CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900 
CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960 
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTCT 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 
TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG OCCTCAAACC 4200 
GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260 
CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTT TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 
CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 
AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560 
GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTG 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 
AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860 
TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TGTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040 
CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100 
ACGTGTTGTA CCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160 
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5 



TATGTATTAT ATAAAAAAAA AAACCCTTAA TGCACTGTTA TCTCCTAAAT ATTTAGTAAA 5220
TTAATACTAT TTAATTTTTT TAAAGATTTG TCTGTGTAGA CACTAAAAGT ATTACACAAA 5280
ATCTGGACTG AAGGTGTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340
AGTATATCCT TTCTAAACTG CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400
ACCTGTTCTT GTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460
GAGATGACTG TAGCTTTTCG TGCTCCACTG CGAGGTTTGT GCTCAGAGCC GCTGCACCCC 5520
AGCGAGGCCT GCTCCATCGA GTGCAGGACG AGCTACTGCT TTCGAGCGAG GGTTTCCTGC 5580
TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640
CATTTATTTT ATATTCTTGG TTGAAATAAA ATTTAATTGA CTTTG



SEQ ID NO:78 PD03 Protein sequence:
Protein Accession #: BAA82980

1 11 21 31 41 51

I I I I I I

VKSLLYQILD GIHYLHANWV LHRDLKPANI LVMGEGPERG RVKIADMGFA RLFNSPLKPL 
ADLDPVWTF WXRAPELLLG ARHYTKAIDI WAIGC1PAEL LTSEPIFHCR QEDIKTSNPF 
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLO.KDFRRTT YANSSLIKYM EKHKVKPDSK 
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYPKRE FLNEDDPEEK 
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNG TAGGAGAGVG GTGAGLQHSQ 
DSSLNC3VPPN KKPRLGPSGA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS 
QQSSQYHPSH QAHRY 

SEQ ID NO:79 PD05 DNA SEQUENCE

Nucleic Acid Accession #: XM_002922

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATTGAA 60
GAGGTACCAC CTCGACCACC TAGOCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120
AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180
TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240
ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300
GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360
TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420
GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480
AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540
ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600
ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660
TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720
ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780
TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840
CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900
AGGGTACTAT TCCTTTATAT CCCATTGOCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960
TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020
CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080
TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140
GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200
ATAAATGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260
CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320
GAGTCCATCA AATOCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380
AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440
GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500
ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560
AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620
GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680
TGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740
TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800
ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860
GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920
ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980
CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040
CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100
ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160
AAACTAGAGA CCAAGAAGAC AAAACTCTGA

SEQ ID NO:80 PD05 Protein sequence
Protein Accession #: XP_002922

1 11 21 31 41 51

I I I I I I

MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60
YGMKAVLILY FLYFLHWNED TSTSIYHAFS SLCYFTPILG AAIADSWLGK FKTIIYLSLV 120
YVLGHVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180
TRYFSVFYLS INAGSLISTP ITPMLRGDVQ CFGEDCYALA FGVPGLLMVI ALWPAMGSK 240
IYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300
RVLFLYIPLP KFWAXjLDQQG SRWTLOAIRM NRNLGFFVLQ PDQKQVLNPF LVLIFIPLFD 360
FVIYRLVSKC GINFSSLRKM AVGMILACLA FAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420
LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480
VQEKNWYSLV IREDGNSISS MMVKDTESKT TNGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540 
EDYGVSAYRT VQRGEYPAVH CRTEDKNFSL NLGLLDFGAA YLFVITNNTO GGLQAWKIED 600 
IPANKMSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGNIIV 660 
LWAQFSGLV GWAEFILFSC LLLVICLIFS IMGYYYVPVK TEDMRGPADK RTPHICGNHI 720 
KLETKKTKL 

SEQ ID NO:81 PD06 DNA SEQUENCE

Nucleic Acid Accession #: NMJE0448 

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCCCAC AAGTAGCTCC 60 

AGCGCCGTAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGGCGC CCTCTTGGCG 120

ATCTTCGGGC ACCTCGTGGT CAGCATTGCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTG GCTGGGCCTG 240 

TTCCTGATGC TTCTGGGCGA GCTGGGTGTG TTCGCCTCCT ACGCCTTCGC GCCGCTGTCA 300 

CTCATCGTGC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTTTCCT TTTGTACATG 540 

CTGGTGGAGA TCATTCTGTT CTGCTTGCTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTGTCC ATTCAAGGGA ACCTGCAGCT TGACTACCCC 720

ATCTTCTACG TGATGTTCGT GTGCATGGTG GCAACCGCCG TCTATCAGGC TGCGTTTTTG 780

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTGCCA GTGTGGGCTA CATTCTGTCC 840 

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960 

ACGCGTAACA GGAAGAAGCC CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020 

GGTATGCAGA ACATGCACGA TAAAGGGATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
GAGCACACCA AGAAGGAATG A 



SEQ ID NO:82 PD06 Protein sequence
Protein Accession #: NPJ)6518I

1 11 21 31 41 51

I I I I I I

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLWSIA LNLQKYCHIR 60

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120 

IKEKWKPKDF LRRYVLSFVG CGLAWGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180 

LVEIILFCLL LYFYKEKNAN NIWILLLVA LLGSMTWTV KAVAGMLVLS IQGNLQLDYP 240 

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDFIGEDV 300

LHICMFALGC LIAFLGVFLI TRNRKKPIPF EPYISMDAKP GMQNMHDKGM TVQPELKASF 360
SYGALENNDN ISEIYAPATL PVMQEEHGSR SASGVPYRVL EHTKKE 

SEQ ID NO:83 PD08 DNA SEQUENCE

Nucleic Acid Accession #: NMJJ32712 
Coding sequence: 555-908 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATCCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAG GAAAGCGCTG CCACCCACCC 120

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTTGTGCCTC AGCTCTCGAG AGCTGCTGCT GCCGGGTGAC CTGATCCAAC CTGATAAGGT 300 

GCCATCTTCA GCTACCACTG CAAGGCCCTG AGGGCAAGAG CAGCACGGCA CTGCCCACCC 360

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTWTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTGG ATAAGTGGGC 720

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGGG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG CCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTGTG CCATCTTGCA CTICTGCCCA GGCAGCAGGG TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 
ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA

SEQ ID NO:84 PD08 Protein sequence
Protein Accession #: NPJ16101

1 11 21 31 41 51

MTVLEAVLEI QAITGSRLLS MVPGPARPPG SCWDPTQCTR TWLLSHTPRR RWISGLPRAS 60
CRLGEEPPPL PYCDQAYGEE LSIRHRETWA WLSRTDTAWP GAPGVKQARI LGELLLV 

SEQ ID NO:85 PDT1 DNA SEQUENCE

Nucleic Acid Accession #: NM_000693

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AGTTCACCAA GATATTTATC AACAATGAAT GGCACGAATC 180 

CAAGAGTGGG AAAAAGTTTG CTACATGTAA CCCTTCAACT CGGGAGCAAA TATGTGAAGT 240 

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

GAGGGGCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTCGGGCC ATCACTCCAT GGAACTTCCC 600 

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAAGCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT CCCGGAGCAA TCTGAAGCGG GTGACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGACGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620

CGGACGGCGG AATGTGGCAG ATGAAATGTG CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860 

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGAGTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTG CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220 

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCATATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE

Protein Accession #: NP_000684

1 11 21 31 41 51 

I I I I I I 

MATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIFINNEW HESKSGKKFA TCNPSTREQI 60 

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDALSRGRLL HQLADLVERD RATLAALETM 120 

DTGKPFLHAF FIDLEGCIRT LRYFAGWADK IQGKTIPTDD NWCFTRHEP IGVCGAITPW 180 

NFPLLMLVWK LAPALCCGNT HVLKPAEQTP LTALYLGSLI KEAGFPPGW NIVPGFGPTV 240 

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSNLKRVTLE LGGKNPCIVC ADADLDLAVE 300 
CAHQGVFFNQ GQCCTAASRV FVEEQVYSEF VRRSVEYAKK RPVGDPFDVK TEQGPQIDQK 360

QFDKILELIE SGKKEGAKLE CGGSAMEDKG LFIKPTVFSE VTDNMRIAKE EIFGPVQPIL 420 

KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGF 480 
KMSGNGRELG EYALAEYTEV KTVTIKLGDK NP

SEQ ID NO:87 PDV3 DNA SEQUENCE

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GACCATTAGC AGGCACCCAG GCCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60 

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180 

ACCATGCCCA GCCTGCTGCT GCTGTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240

CTGACAGACG CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTCCCG GGCTCTCCCC TGGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCCGGGCCT GCCGCGAGGG CGAGCTCTCC 600 

ACCTGCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780 

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960 

AAGGGCCGGC TGGAGCTGGT CAACAGOCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020 

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TOGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAGTG 1440

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG GCTTTTTCTC TCCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

CCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040

GACGTTATAC TGTCTTCCCC CACCCCCGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGCGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGGTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 

SEQ ID NO:88 Protein sequence
Protein Accession #: NP_116031

1 11 21 31 41 51

I I I I I I

MPSLLLLFTA ALLSSWAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60 
LCQLYQEHMA YIGEGAKTGI KECQHQFRQR RWNCSTADNA SVFGRVMQIG SRETAFTHAV 120 
SAAGWNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAXEFVDARE 180 
REKNFAKGSE EQGRVLMNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240 
VGDRLKEKYD SAAAMRVTRK GRLELVNSRF TQPTPEDLW VDPSPDYCLR NESTGSLGTQ 300 
GRLCNKTSEG HDGCELHCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EXVDQY1CK* 



SEQ ID NO:89 PDT9 DNA SEQUENCE

Nucleic Acid Accession #: NM_033280
Coding sequence: 53-636 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60 

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCOG GCTTGGATAT CTTCGGGGAC 120

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360 

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA ACTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGACCA 660 

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720
TGTATAAAAG GOAACAGTGT G6AGATGTTT TTGTCTTGTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAAA 

SEQ ID NO:90 Protein sequence
Protein Accession #: NP_150596

1 11 21 31 41 51 

I I I I I I 

MVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALMIWKG LIVLTGSESP 60 

IVWLSGSME PAFHRGDLLF LTNFREDPIR AGEIWFKVE GRDIPIVHRV IKVHEKDNGD 120 

IKFLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGMVTIIMND YPKFKYALLA 180 
VMGAYVLLKR ES 

SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NMJH6590

Coding sequence: 691-975 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GATTACTCAC ACAGTCTTGA AGATGCAATG TCAGCTATTT AGGACAGAAA CATCCAAGGC 60

CGTGTCAGAA CTCAATTACG ACTACATATG CATTAAGGCA GGAACTGGCA GGCCTCAGGG 120 

TACGCCAACT ATAGGACTCG TGCTTCTCGT ACGCTGGGCT ATAATCTATG AAACTGAGCT 180 

CCAGAGCCAG CCAATCACTT AGCTCCTCAT AACAAGTCTA ACTGGCTCTG GAAAGCTGAA 240 

AGGGCTGCAC TGGAACAACA CAGATGAGAT ATTCTACACA TTAATCTACT TATCTGGAAT 300 

CACTTTGCCT CTAAAGGCCA GAGAAAAATC ACAGCTTCCT TGTCGGAGGG GAAAAGGACA 360 

GGTGATCTGG GGAAAACGCA GCTACACCTG GAGCAAGGTC TCTTCCCGGC TTGGCAATCT 420 

CAGCTGTGCC GGCGCTACGG GACCCGAGCC GTCCCAGAAA CCAAAGGGCA GGCACGGCAG 480 

CAAACGOCTG AGTGCTGCTG CCTTCGGTGA CTATATGAGA ATGGAAACTT CTAAGGAAGC 540 

CAGGTTGTTA GAATTGTTAC CCCCTTTACT CAGAGATAAC ATAGATTATC CAGGCTGAGA 600 

TGGAAAACAA GCCCTTTATT GAATTTTCAA CACAGACTCC CTGCTTCTCA TCTCCTTAAT 660 

AAAATTTCAT TAAAATCCCC TTGAACTCCC ATGTTCAAAT CTCCATTTGT TGACAGACAA 720 

AGCCAACAAT ACTCTAAACT GAGGCCTGCA AGTCATTTCA TTTGTATTTT TGTCCAGAAA 780 

TTTCCCATAG GAAGACTTCA CCTCCTACAA CTCCGAAGAA AACCCTTACT GTCCAAGACC 840 

GTCACCAGCA ACCATCCGCA GTCATTCAAG TGGAAGCTTT CACAGCTTTT GTACATTCTC 900 

TCTCTCAATA TACAACTGAG TTACAGACTG TCCCCTGGCT CCCTGACCCT TACAAACACT 960 

AAAAGTTTTG TTTGACTCAA CTTCAAGCTG CTCATCTGTT AGTAAGTGAT GTTCACTCCA 1020 

GAACACATTC ATGATGAGAA CTTTCTAAAA GACCAGCACT GCTCTTCCCC TCCTATAATC 1080 

ATAATAATCA TGATAACCTG AAACATGTTA CTGGGACTCG ACATTTTTCT GGGGATTGAA 1140 

ATCTTTAGTC CTTGGAGCTG TCACATAGCA GGGGCAACCT CACACTGAAA CAAAGGAAGT 1200 

GATCTCCCAT TATTATCCAC CCTGAGCCAC CATAATATGC TGTTTACATT TATTTTCTTC 1260 

AGCCTGTGCA AAACAAAGCA ATGGAAAAGG AAACTAAAAA ATATACATAC TAGTACCATT 1320 

ATCTTCTTTT GCCTAAAATT ACTAATGCAC CACGTCAGTC TGCTTCCTTC AGGCATCATT 1380 

CTCAATTCAT CAGGACTTGT ATTAGCAGGT TCTGGCTAGA GAGACTATCT CCTGTCATCA 1440 

CGATCAATTA ATGTTTTCTG GTGATCACAT CAGGCCCTAT CTAAGAAGCT CATGGTATAC 1500 

AAGGGTCACC CAAATAGCTG AGTGCAGTCC TTGCTCATAT TTCCTTCATC TTAACCCCGC 1560 

AAACAAGAAT TAAGATGATC CCAATAAAAG AAAAATTGCT CAGGAAACTG AACCTTTTTC 1620 

TGAACCAAGC ACTGTCAGCA AATCTCAGGT ATTAGAGCAA CTATGGTTGA TTGAAAAGTG 1680 

TCTCAAAATC TGGGCCAAGA ATGATTGCTA GGTCCATAAG CTAATTTGTC TGGCCTTGCC 1740 

ATTTACGTAA GCCAAAGAAA GTCACTCATG AGTAAACTAT AGAAAACGTT CAGACCCATC 1800 

CTGTTAGTAT GTCAAATCAA CTAAGACTGG CAGGGTATTA ACTCCATTCC AGGTGACATG 1860 

GATAAAGAGC CCCATTATTT TCACAGTGCC AGCCTCTACC TAAGGAAACC CTAGACCTTG 1920 

GAACCAGTTT CCTGGTAGGG AACTGCTGAC AGTTTCAATG CTGACAGTTG GAGCCAATGC 1980 

CTCATAGTGT AAACTGAAAG AAAAATAGTT GCTTTTTAAA ATGTCAGCAA GAAGGCCTGC 2040 

CTCATCTTAA CAAAGCAAAA AAAAATGCTT TAATTCAAAT TAAAAATCAT GATACTAAAA 2100 
AAAAAAAA 

SEQ ID NO:92 PDV5 Protein sequence
Protein Accession #: NP_057674

1 11 21 31 41 51 

I I I I I I 

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAIIYB TELQSQPIT 

SEQ ID NO:93 PEE6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60 

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300 

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTGGCCAG 360 

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGCCCCAGGG CTGCTACCAG 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTOC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660
TGCAAGAGTG ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720
TGCCCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780
CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGCCCTGCG GAAGCCGACC 840
TTTGACGTCT GGCTTTGGGA GCCCAATGAG ATGCTGAGCT GCCTGGAGCA CATGTACCAC 900
GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960
TGTGTCCACG ACAACTACAG AAACAACCCC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020
GCCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080
GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCACGATC TGGACCATCC CGGCTACAAC 1140
AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200
CTGGAGAACC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA GTGCAACATC 1260
TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATCCGAC AGGGAATGAT CACATTAATC 1320
TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380
AATTTTGACT ACAGCAACGA GGAGCACATG ACCCTGCTGA AGATGATTTT GATAAAATGC 1440
TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500
TTAGAGGAAT ATTTTATGCA GAGCGACCGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560
TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGCCCAGA TTGGGTTCAT CAAGTTTGTC 1620
CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680
CAGCCACTTT GGGAATCCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740
AAAGAGTTAC AGAAGAAGAC TGACAGCTTG ACGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800
AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860
CTGCAGTTCT GGACGGGCTG GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920
TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980
AAAAAAAAAA A

SEQ ID NO:94 PEE6 Protein sequence
Protein Accession #: NP_002597

1 11 21 31 41 51

I I I I I I

HGSGSSSYRP KAIYLDIDGR IQKVIFSKYC NSSDIMDLFC IATGLPRNTT ISLLTTDDAM 60
VSIDPTWPAN SERTPYKVRP VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR WGLEQPRRE 120
GAFESGQVEP RPREPQGCYQ EGQRIPPERE ELIQSVLAQV AEQFSRAFKI NELKAEVANH 180
LAVLEKRVEL EGLKWEIEK CKSDIKKMRE ELAARSSRTN CPCKYSFLDN HKKLTPRRDV 240
PTYPKYLLSP ETIEALRKPT FDVWLWEPNE MLSCLEHMYH DLGLVRDFSI NPVTLRRWLF 300
CVHUNYRNNP FHNFRHCFCV AQMMYSMVWL CSLQEKFSQT DILILMTAAI CHDLDHPGYN 360
NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGFK QIRQGMITLI 420
LATDMARHAE IMDSFKEKME NFDYSNEEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480
LEEYFMQSDR EKSEGLPVAP FMDRDKVTKA TAQIGFIKFV LIPMFETVTK LFPMVEEIML 540
QPLWESRDRY EELKRIDDAM KELQKKTDSL TSGATEKSRE RSRDVKNSEG DCA

SEQ ID NO:95 PEG4 DNA SEQUENCE

Nucleic Acid Accession #: none

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60
TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120
TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180
TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTOC TGGGACGTGA AACTGGGAGC 240
CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCCAGGC GAATACATCA CAAAAGTCTT 300
TGTCGCCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360
CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420
GCTGGTGGGC ATCTATGGCC AGTATCAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480
GAATTATCCA CTAGAGGAGC CGACCACTGA GCCACCAGTT AATCTCACAT ACTCAGCAAA 540
CTCACCCGTG GGTCGCTAGG GTGGGGTATG GGGCCATCCG AGCTGAGGCC ATCTGTGTGG 600
TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A



SEQ ID NO:96 PEG4 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I

MLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 60 
WDVKLGALGG NTQEVTLQPG EYITKVFVAF QAFLRGMVMY TSKDRYFYFG KLDGQISSAY 120 
PSQEGQVLVG IYGQYQLLGI KSIGFEWNYP LEEPTTEPPV NLTYSANSPV GR 



SEQ ID NO:97 PELS DNA SEQUENCE

Nucleic Acid Accession #: NM_006953

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGTTCCGCG CTCTGGCGGC TCCTCCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGCCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240 

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300 
TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360 

GCAGTGACCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGG 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTGTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720 

TTGCCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCCCAGGCC CTGCAGCGGT 960 

GGTTGTCACA CCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

SEQ ID NO:98 Protein sequence
Protein Accession #: NP_008884

1 11 21 31 41 51

I I I I I I

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKFLCMFD SKEALTGTHE 60

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDLIPC SDLPSLDAIG 120 

DVSKASQILN AYLVRVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180 

LWSDPIRTNQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDHGSS 240 
DGETTHDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD 



SEQ ID NO:99 PEN1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012391

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180
TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCGCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCG AGGTGGACTC 1140

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGCCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GCACCCCATC TGAGTGCCTG GCCCAGGGCC 1440

TGAAACCCGC CCTCAGGGGC CTCTCTCCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCOC TGGAGGACAG AGGGAGACAG GGCTGCTCCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 

SEQ ID NO:100 PEN1 Protein sequence 
Protein Accession #: NP_036523 

1 11 21 31 41 51 

| | | | | | 

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAFYL 60 

SYFDMLYPED SSWAAKAPGA SSREEPPEEP EQCPVIDSQA PAGSLDLVPG GLTLEEHSLE 120 

QVQSMVVGEV LKDIETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP MGKAFQELAG 180 

KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEESWTDSEV 240 

DSSCSGQPIH LWQFLKELLL KPHSYGRFIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAM 300 
NYDKLSRSIR QYYKKGIIRK PDISQRLVYQ FVHPI 

SEQ ID NO:101 PEN3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000742 

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAGAGAACAG CGTGAGCCTG TGTGCTTGTG TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60 

GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120 

CTGCATGAAG OCGTTCTGGC TGCCAGAGCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 

AGAGCTTGCC CAGCTGTCCC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240 

GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300 

GCTCTATTCT GTACCTGCCA CTCTATTTCT GGGGTGACTT TTGTCAGCTG CCCAGAATCT 360 

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TTTCTTCTGT AACCACAGGT 420 

TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 

TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAAG CCCTGACCTG ACCTCCTGAT 540 

GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAGCTCAGCC 600 

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660 

CTCCTGGAGA CCCACTCTCC TCTCCCAGTC CCACGGCATT GCCGCAGGGA GGCTCGCATA 720 

CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCGC TGGGCGCGCC 780 

CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840 

TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTGGA 900 

GCGACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960 

CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020 

CAGTGACCCA CATGACCAAG GCCCACCTCT TCTCCACGGG CACTGTGCAC TGGGTGCCCC 1080 

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCACCTT CTTCCCCTTC GACCAGCAGA 1140 

ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 

TGGAGCAGAC TGTGGACCTG AAGGACTACT GGGAGAGCGG CGAGTGGGCC ATCGTCAATG 1260 

CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320 

CCTACGCCTT CGTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440 

AGATCACGCT GTGCATTTCG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560 

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACCGCT 1620 

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCCC 1680 

GGTGGCTTCT GATGAACCGG CCCCCACCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGG 1800 

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 

AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980 

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040 

TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100 

CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 

ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTrT GGAGTCTGTC CGAGTTTGCA 2340 

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400 

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 

CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580 

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AAGA 

SEQ ID NO:102 PEN3 Protein sequence 
Protein Accession #: NP_000733 

1 11 21 31 41 51 

| | | | | | 

MGPSCPVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP QGGSHTETED 60 

RLFKHLFRGY NRWARPVPNT SDVVIVRFGL SIAQLIDVDE KNQMMTTNVW LKQEWSDYKL 120 

RWNPADFGNI TSLRVPSEMI WIPDIVLYNN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180 

SSCSIDVTFF PFDQQNCKMK FGSWTYDKAK IDLEQMEQTV DLKDYWESGE WAIVNATGTY 240 

NSKKYDCCAE IYPDVTYAFV IRRLPLFYTI NLIIPCLLIS CLTVLVFYLP SDCGEKITLC 300 

ISVLLSLTVF LLLITEIIPS TSLVIPLIGE YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360 

TMPHWVRGAL LGCVPRWLLM NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREVVVEEE 420 

DRWACAGHVA PSVGTLCSHG HLHSGASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480 
RSEDADSSVK EDWKYVAMVI DRIFLWLFII VCFLGTIGLF LPPFLAGMI 

SEQ ID NO:103 PEU4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018670 

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GTGCCCGCCG CTCTCCGAGT 120 

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360 

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 

ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540 

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660 
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10 
TATCCGCCGT CCGCGCCGGG GCGTCCTGGG GATCCCCGCC TGCCTGCCCC GGAGCCCGAG 720 

CTGCACCCGA GCCGCGCGAC CCGCCTGCGC TGTTCGCCGA GGCGGCGTGC CCGGAAGGGC 780 

AGGCGATGGA GCCAAGCCCA CCGTCCCCGC TCCTTCCGGG CGACGTGCTG GCTCTGTTGG 840 

AGACCTGGAT GCCCCTCTCG CCTCTGGAGT GGCTGCCTGA GGAGCCCAAG TGACAAGGGA 900 

CAACPGACGC CGTCTCTGTG AGCACCGAGG CTTTTTGGCC TCAGCACCTT CGAAGTGGTT 960 

CCTTGGCAGA CTGCCTTTCC TGGAAGAGGG CACGGGCGAT CCCGACGGGG GCATTCCTGC 1020 

GGGTGAGAGC CGTCCCCACC GCGGCGGCCC TTCTCAGCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGGA GGCTCTGCCT AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTGT CAAAAAAAAA AAAAAAAAAA A 

SEQ ID NO:104 PEU4 Protein sequence 
Protein Accession #: NP_061140 

1 11 21 31 41 51 

| | | | | | 

MAQPLCPPLS ESWMLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60 

LRDPRAPSVG RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120 

TKIETLRLAI RYIGHLSAVL GLSEESLQRR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180 

EGQGQGRGLG LVSAVRAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPE GQAMEPSPPS 240 
PLLPGDVLAL LETWMPLSPL EWLPEEPK 

SEQ ID NO:105 PEU5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017636 

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCACGGAGAA GCCCACCGAT GCCTACGGAG AGCTGGACTT CACGGGGGCC GGCCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGATCCAGC TGCAGTTTAT AGTCTGGTCA 120 

CACGCACATG GGGCTTCCGT GOCCCGAACC TGGTGGTGTC AGTGCTGGGG GGATCGGGGG 180 

GCCCCGTCCT CCAGACCTGG CTGCAGGACC TGCTGCGTCG TGGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTGT ACGGGACCAT CAGATGGCCA GCACTGGGGG CACCAAGGTG GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACCCT CATCAACCCC AAGGGCTCGT 420 

TCCCTGCGAG GTACCGGTGG CGCGGTGACC CGGAGGACGG GGTCCAGTTT CCCCTGGACT 480 

ACAACTACTC GGCCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGCGAGA 540 

ACCGCTTCCG CTTGCGCCTG GAGTCCTACA TCTCACAGCA GAAGACGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCCCTGTC CTGCTCCTCC TGATTGATGG TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTCAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720 

CTGCGGACTG CCTGGCGGAG ACCCTGGAAG ACACTCTGGC CCCAGGGAGT GGGGGAGCCA 780 

GGCAAGGCGA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900 

AGGATGGGTC TGAGGAATTC GAGACCATAG TTTTGAAGGC OCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGCCTAC CTGGATGAGC TGCGTTTGGC TGTGGCTTGG AACCGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATCCAATG GCGGTCCTTC CATCTCGAAG 1080 

CTTCCCTCAT GGACGCCCTG CTGAATGACC GGCCTGAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGGCCAC TTCCTGACCC CGATGCGCCT GGCCCAACTC TACAGCGCGG 1200 

CGCCCTCCAA CTCGCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCAGGCACCA 1260 

AAGCCCCAGC CCTAAAAGGG GGAGCTGCGG AGCTCCGGCC CCCTGACGTG GGGCATGTGC 1320 

TGAGGATGCT GCTGGGGAAG ATGTGCGCGC CGAGGTACCC CTCCGGGGGC GCCTGGGACC 1380 

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTCGGACAAG GCCACCTCGC 1440 

CGCTCTCGCT GGATGCTGGC CTCGGGCAGG CCCCCTGGAG CGACCTGCTT CTTTGGGCAC 1500 

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTCC AATGCAGTTT 1560 

CCTCAGCTCT TGGGGCCTGT TTGCTGCTCC GGGTGATGGC ACGCCTGGAG CCTGACGCTG 1620 

AGGAGGCAGC ACGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680 

TTGGCGAGTG CTATCGCAGC AGTGAGGTGA GGGCTGCCCG CCTCCTCCTC CGTCGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATGCA AGCTGACGCC CGTGCCTTCT 1800 

TTGCCCAGGA TGGGGTACAG [illegible] CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTG GTTCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920 

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980 

ATAGTGTCAT TAATGGGGAA GGGCCTGTCG [illegible] CCCAGCCGAG AAGACGCCGC 2040 

[illegible] GCGCCAGTCG GGCCGTCCGG GTTGCTGCGG GGGCCGCTGC GGGGGGCGCC 2100 

GGTGCCTACG CCGCTGGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA CCTGCTGTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTCGCTG GAGCTGCTGC TCTATTTCTG GGCTTTCACG CTGCTGTGCG 2280 

AGGAACTGCG CCAGGGCCTG AGCGGAGGCG GGGGCAGCCT CGCCAGCGGG GGCCCCGGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT CGCCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCGTGGG CTGCCGGCTG ACCCCGGGTT 2460 

TGTACCAOCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GGTTTTCACG GTGCGGCTGC 2520 

TTCACATCTT CACGGTCAAC AAACAGCTGG GGCCCAAGAT CGTCATCGTG AGCAAGATGA 2580 

TGAAGGACGT GTTCTTCTTC CTCTTCTTCC TCGGCGTGTG GCTGGTAGCC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTG CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATG GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTCGGAGC CCGGCTTCTG GGCACACCCT CCTGGGGCCC 2820 

AGGCGGGCAC CTGCGTCTCC CAGTATGCCA ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880 

TCCTGCTCGT GGCCAACATC CTGCTGGTCA ACTTGCTCAT TGCCATGTTC AGTTACACAT 2940 

TCGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAATTCCA CTCTCGGCCC GCGCTGGCCC CGCCCTTTAT CGTCATCTCC CACTTGCGCC 3060 

TCCTGCTCAG GCAATTGTGC AGGCGACCCC GGAGCCCCCA GCCGTCCTCC CCGGCCCTCG 3120 

AGCATTTCCG GGTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180 

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TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240 

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA GTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGCGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT AGCAGCTCTG CCATGTTGCC CTCAGGTGGG CCGCCACCCC 3420 

TTGACCTGCA TGGGTCCAAA GAGTGAGCCA TGCTGGCGGA TTTTAAGGAG AAGCCCCCAC 3480 

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTCG GCCCCCGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCCCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCCGGGCC GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 



SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106 



1 11 21 31 41 51 

| | | | | | 

MASTGGTKVV AMGVAPWGVV RNRDTLINPK GSFFARYRWR GDPEDGVQFP LDYNYSAFFL 60 

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLIDGDEKM LTRIENATQA 120 

QLPCLLVAGS GGAADCLAET LEDTLAPGSG GARQGEARDR IRRFFPKGDL EVLQAQVERI 180 

MTRKELLTVY SSEDGSEEFE TIVLKALVKA CGSSEASAYL DELRLAVAWN RVDIAQSELF 240 

RGDIQWRSFH LEASLMDALL NDRPEFVRLL ISHGLSLGHF LTPMRLAQLY SAAPSNSLIR 300 

NLLDQASHSA GTKAPALKGG AAELRPPDVG HVLRMLLGKM CAPRYPSGGA WDPHPGQGFG 360 

ESMYLLSDKA TSPLSLDAGL GQAPWSDLLL WALLLNRAQM AMYFWEMGSN AVSSALGACL 420 

LLRVMARLEP DAEEAARRKD LAFKFEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDM ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVINGEG PVGTADPAEK TPLGVPRQSG RPGCCGGRCG GRRCLRRWFH 600 

FWGAPVTIFM GNVVSYLLFL LLFSRVLLVD FQPAPPGSLE LLLYFWAFTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC FLLGVGCRLT PGLYHLGRTV 720 

LCIDFMVFTV RLLHIFTVNK QLGPKIVIVS KMMKDVFFFL FFLGVWLVAY GVATEGLLRP 780 

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VALMEHSNCS SEPGFWAHPP GAQAGTCVSQ 840 

YANWLVVLLL VIFLLVANIL LVNLLIAMFS YTFGKVQGNS DLYWKAQRYR LIREFHSRPA 900 

LAPPFIVISH LRLLLRQLCR RPRSPQPSSP ALEHFRVYLS KEAERKLLTW ESVHKENFLL 960 
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 



SEQ ID NO:107 PEW3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005982 

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

CGCCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGCC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGCCGCGCCC CCCTCCCTGC GGCCGCCGCC CCCTGCCTCT CGGCTCTGCT CCCTGCCGCG 240 

TGCGCCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG GTCGCCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GAGCTGGCCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACCGGGCCGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCACC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTGCT CGGCCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 



SEQ ID NO:108 PEW3 Protein sequence 
Protein Accession #: NP_005973 



1 11 21 31 41 51 

| | | | | | 

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNESVL KAKAVVAFHR 60 

GNFRELYKIL ESHQFSPHNH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120 

IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180 

AAEAKERENT ENNNSSSNKQ NQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240 
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS 



SEQ ID NO:109 PFJ8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005069 

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GGGGCTCCGC GGGCCTGGAG CACGGCCGGG TCTAATATGC CCGGAGCCGA GGCGCGATGA 60 
AGGAGAAGTC CAAGAATGCG GCCAAGACCA GGAGGGAGAA GGAAAATGGC GAGTTTTACG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240 
ACGCGTGGGG ACAGCCGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300 
CGCACTTGCT GCAGACTTTG GATGGATTTG TTTTTGTGGT AGCATCTGAT GGCAAAATCA 360 
TGTATATATC CGAGACCGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420 
ACAGTATTTA TGAATACATC CATCCTTCTG ACCACGATGA GATGACCGCT GTCCTCACGG 480 
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TCGTTCTTTC 540 
TTCGAATGAA ATGTGTCTTG GCGAAAAGGA ACGCGGGCCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTG CAGTGGCTAC TTGAAGATCA GGCAGTATAT GCTGGACATG TCCCTGTACG 660 
ACTCCTGCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720 
TCACCGAGAT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780 
TGATATTCCT GGATTCCAGG GTGACCGAGG TGACGGGTTA CGAGCCGCAG GACCTGATCG 840 
AGAAGACCCT ATACCATCAC GTGCACGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900 
ACCTCCTGTT GGTGAAGGGC CAGGTCACCA CCAAGTACTA CCGGCTGCTG TCCAAGCGGG 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGGC 1020 
COCACTGCAT CGTGAGTGTC AATTATGTAC TCACGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CXCAGGACTC CTGGAGGACC GCCTTGTCTA 1140 
CCTCACAAGA AACTAGGAAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200 
GAACAAACCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCOCC CTGCAAGCGC TGCTGCTCCT CCAGAACTGC 1320 
AGOCCCACTC AGAAAGCAGT GACCTTCTGT ACACGOCATC CTACAGCCTG CCCTTCTCCT 1380 
ACCATTACGG ACACTTCCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
CGGCCAAGTT CGGGCAGCCC CAAGGATCCC CTTGTGAGGT GGCACGCTTT TTCCTGAGCA 1500 
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGCCCGCC GCCGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680 
GCTTCCCGAG CTGCGGCCAC TACCGCGAGG AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTGCG 1800 
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCCCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCCGGCGGC CCCGAGGCGG CGACCGGOGC GCTGCGGCTC CGGCACXXGA 1980 
GCCOCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCAA CGGGAGGTGA CCCGCTGGCC GCCCGCGCCA GGAGCCTGGA CCCGGCCTCC 2100 
CGGGGCTGCG GCGCCACCGA GCCCGGCAAA TGCGCACGAC CTACATTAAT TTATGCAGAG 2160 
ACAGCTGTTT GAATTGGACC CCGCCGCCGA CTTGCGGATT TCCACCGCGG AGGCCCCGCG 2220 
CGCCGGTGCC GAGGGCCGAG GAGCGCCCGG GTCCGGGCAG GTGACCGCOC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTTGGTTT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTCTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCTC 2460 
TTCCAAGTGG ACGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTGAAGGC AGAAGTGATG ATTGTAAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT A l l 1 i 1U1 1 1 TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTOCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCCCTATGA ACTCTTGATA 2760 
ACAOCAAGAG TAGCACCTTC AGAATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAGACA GTTTATGAGA ATGACOCTGT CAAGCTTCAT TATTACX5TGG CAAAATGCCT 2880 
CTGGCCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA CGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGCGAG 3060 
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180 
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACCGCCGT 3240 
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300 
ATTTTAGGAA ACTTTTTCCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360 
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTGACTTTT 3420 
TITTTTTTTTrn I GCCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGGA CCGTGGGTCA 3540 
TGCAGCGAAG GGGCTGGATG GTAGGAAGGG ATGTGCCCGC CTCTCCACGC ACTCAGCTAT 3600 
AOCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA ATTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTACCTTGTT 3780 
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG 



SEQ ID NO:110 PFJ8 Protein sequence: 
Protein Accession #: NP_005060.1 

1 11 21 31 41 51 
I I I I I I 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240 
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300 
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VIITNGR 



SEQ ID NO:111 PFJ7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006549 

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ATGAACGGAC GCTGCATCTG CCCGTCCCTG CCCTACTCAC CCGTCAGCTC CCCGCAGTCC 60 
TCGCCTCGGC TGCCOCGGCG GCCGACAGTG GAGTCTCACC ACGTCTCCAT CACGGGTATG 120 
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTCCTAT 180 
GGTGTCGTCA AGTTGGCCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TCCAAAAAGA AGCTGATOCG GCAGGCCGGC TTTCCACGTC GCOCTCCAOC CCGAGGCACC 300 
CGGCCAGCTC CTGGAGGCTG CATCCAGCCC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTGOCATGC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTOCTGGAT 420 
GACCCCAATG AGGACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GCCCGTGATG 480 
GAAGTGCCCA CCCTCAAACC ACTCTCTGAA GACCAGGCCC GTTTCTACTT CCAGGATCTG 540 
ATCAAAGGCA TCGAGTACTT ACACTACCAG AAGATCATCC ACCGTGACAT CAAACCTTCC 600 
AACCTCCTGG TCGGAGAAGA TGGGCACATC AAGATCGCTG ACTTTGGTGT GAGCAATGAA 660 
TTCAAGGGCA GTGACGCGCT CCTCTCCAAC ACCGTGGGCA CGCCCGCCTT CATGGCACCC 720 
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG OCTTGGATGT TTGGGCCATG 780 
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GOGGATCATG 840 
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900 
GAGGACTTGA AGGACCTGAT CACGCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960 
GTGCCGGAAA TCAAGCTGCA CCCCTGGGTC ACGAGGCATG GGGCGGAGCC GTTGCCGTCG 1020 
GAGGATGAGA ACTGCACGCT GGTCGAAGTG ACTGAAGAGG AGGTCGAGAA CTCAGTCAAA 1080 
CACATTCCCA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATACGTAA ACGCTCCTTT 1140 
GGGAACCCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCACCAAAA AACCAACCAG GGAATGTGAG TCCCTGTCTG AGCTCAAGAC CJA&AAAATA 1260 
AGTCCCCITC CTGCCTGTTG CAAAGTAACG TAAGAGTTCC CTCACCCGAG TGGATGCAGA 1320 
CX3TTCTTGCTGTCAGCCACC TTCCTTCATA CACATAGCCA GCCCAGGGTG ACCAGAACGT 1380 
CCCAGGACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCTGGTG GGCAOCCCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTGACTTGG TGGGAGTTCC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560 
TACAATTCAC ATACCATGTA ATTCAOCCAC GGGAAGTGTA TGATTCAGTG GTTTCTAATA 1620 
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACGACAT TTTCATCAGC CCAAGAAGAC 1680 
ACCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740 
GTATGGATTT GCCTATTCTG GACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800 
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence: 
Protein Accession #: NP_006540.1 

1 11 21 31 41 51 
I I I I I I 

MNGRCICPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60 
GVVKLAYNEN DNTYYAMKVL SKKKLIRQAG FPRRPPPRGT RPAPGGCIQP RGPIEQVYQE 120 
IAMLKKLDHP NVVKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180 
IKGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADFGVSNE FKGSDALLSN TVGTPAFMAP 240 
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPFMDERIM CLHSKIKSQA LEFPDQPDIA 300 
EDLKDLITRM LDKNPESRIV VPEIKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360 
HIPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECE SLSELKT 



SEQ ID NO:113 PFJ6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_021810 

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ATGAAACCTC TGATATGGAC ATGGTCAGAT GTTGAAGGCC AGAGGCCGGC TCTGCTCATC 60 
TGCACAGCTG CAGCAGGACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGCCA 120 
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTACCCAG ATGCCACAAT GCACAGACAA 180 
CTCCTGGCTC CGGTGGAAGG AAGGATGGCA GAGACATTGA ATCAGAAACT CCATGTTGCC 240 
AATGTGCTGG AAGATGACCC CGGCTACCTA CCTCACGTCT ACAGCGAGGA AGGGGAGTGT 300 
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360 
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CTGGACTCTT TGGGTTCAAA AGCGACTCCG TTTGAGGAAA TATATTCAGA GTCAGGTGTT 420 
CCTTCCTAA 



SEQ ID NO:114 PFJ6 Protein sequence: 
Protein Accession #: NP_068582.1 

1 11 21 31 41 51 
| | | | | | 

MKPLIWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RSVKNIHSTP AYPDATMHRQ 60 
LLAPVEGRMA ETLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEEIYSESGV PS 



SEQ ID NO:115 PFJ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006361 

Coding sequence: 131-985 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGGAGA GCGAGCTGGG TGCCCCCTAG ATTCCCCGCC CCCGCACCTC ATGAGCCGAC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTGGAT GGAGCCAAGG ATATCGAAGG 180 
CTTGCTGGGA GCGGGAGGGG GGCGGAATCT GGTCGCCCAC TCCCCTCTGA CCAGCCACCC 240 
AGCGGCGCCT ACGCTGATGC CTGCTGTCAA CTATGCCOCC TTGGATCTGC CAGGCTCGGC 300 
GGAGCCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCCCAGCTCC 360 
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420 
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480 
GGAAGAGTAC CCCAGTCGCC CCACTGAGTT TGCCTTCTAT CCGGGATATC CGGGAACCTA 540 
OCACGCTATG GCCAGTTACC TGGACGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGGAGA 600 
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660 
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720 
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTGACGCC TGCGCCTTTC GTCGCGGCCG 780 
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGGAG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGCCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGCCAA 960 
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020 
GGGGTGTCCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080 
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140 
CGGCCTGGGT ACCCAGTATG TGCAGGGAGA CGGAACCCCA TGTGACAGGC CCACTCCACC 1200 
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 



SEQ ID NO:116 PFJ5 Protein sequence: 
Protein Accession #: NP_006352.1 

1 11 21 31 41 51 
I I I I I I 

MEPGNYATLD GAKDIEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KQCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120 
PSRPTEFAFY PGYPGTYHAM ASYLDVSVVQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180 
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP 



SEQ ID NO:117 PFJ4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005628 

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTCCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGGAGG TCCGCTTCCC GGCCCCCAGA 120 
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGGA 180 
TCCAGGCGTC CGGGATCTGC GCCACCAGAA CCTAGCCTCC TGCAGACCTC CGCCATCTGG 240 
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCAGAGAAAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCCGGGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360 
CACGGGGCAG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420 
AAGAGCCAAG GAACTTCAGT GCTGTGAACT CACAACTCTA AGGAGCCCTC CAAAGTTCCA 480 
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGGAACGTC GGGTCCTGGG AAGGAGCCCA 540 
AGCGCTCCCA GCCAGCTTCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600 
ATCCTCCTCG AGACTCCAAG GGGCTCGCAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660 
CGCTGGCCTC CATCGAGGAC CAAGGCGCGG CAGCAGGCGG CTACTGCGGT TCCCGGGACC 720 
AGGTGCGCCG CTGCCTTCGA GCCAACCTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780 
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGOCGGGGG TGCGCTGGCG TTGGGCCCGG 840 
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AGCGCTTGAG CGCCTTCGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTG CGGATGATCA 900 
TCTTGCCGCT GGTGGTGTGC AGCTTGATCG GCGGCGCCGC CAGCCTGGAC CCCGGCGCGC 960 
TCGGCCGTCT GGGCGCCTGG GCGCTGCTCT TTTTCCTGGT CACCACGCTG CTGGCGTCGG 1020 
CGCTCGGAGT GGGCTTGGCG CTGGCTCTGC AGCCGGGCGC CGCCTCCGCC GCCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGCOGAAA ATGCCCCCAG CAAGGAGGTG CTCGATTCGT 1140 
TCCTGGATCT TGCGAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACCAC CTATGAAGAG AGGAATATCA CCGGAACCAG GGTGAAGGTG CCCGTGGGGC 1260 
AGGAGGTGGA GGGGATGAAC ATCCTGGGCT TGGTAGTGTT TGCCATCGTC TTTGGTGTGG 1320 
CGCTGCGGAA GCTGGGGCCT GAAGGGGAGC TGCTTATCCG CTTCTTCAAC TCCTTCAATG 1380 
AGGCCACCAT GGTTCTGGTC TCCTGGATCA TGTGGTACGC CCCTGTGGGC ATCATGTTCC 1440 
TGGTGGCTGG CAAGATCGTG GAGATGGAGG ATGTGGGTTT ACTCTTTGCC CGCCTTGGCA 1500 
AGTACATTCT GTGCTGCCTG CTGGGTCACG CCATCCATGG GCTCCTGGTA CTGCGCCTCA 1560 
TCTACTTCCT CTTCACCCGG AAAAACGCCT ACCGCTTOCT GTGGGGCATC GTGACGCCGC 1620 
TGGCCACTGC CTTTGGGACC TCTTGCAGTT CCGCCACGCT GCCGCTGATG ATGAAGTGCG 1680 
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740 
CCGTCAACAT GGACGGTGCC GCGCTCTTCC AGTGCGTGGC CGCAGTGTTC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAAAGA TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA CCTCCCXjGTC GACCATATCT CCTTGATCCT GGCTGTGGAC TGGCTAGTCG 1980 
ACCGGTCCTG TACCGTCCTC AATGTAGAAG GTGACGCTCT GGGGGCAGGA CTCCTCCAAA 2040 
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCCCCT GGATCCGCTG CCAGTCCCCA CTGAGGAAGG AAACCCCCTC CTCAAACACT 2160 
ATCGGGGGOC CGCAGGGGAT GCCACGGTCG CCTCTGAGAA GGAATCAGTC ATGTAAACCC 2220 
CGGGAGGGAC CTTCCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGGAATG 2280 
GATAAATGGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340 
CCAGCACCCT CCAGGACAGG AGATCTGGGA TGCCTGGCTG CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACCC CCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTGCTGCGT CCCCACCGTG ACCTGCCTGG CCTCCCCTGT 2520 
CTCAGGGAGC AGGTCACAGG TCACCATGGG GAATTCTAGC CCCCACTGGG GGGATGTTAC 2580 
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTGTGTG TGCACGTGTG 2640 
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTGACC TCCTGTCCCC ATGGTACGTC 2700 
CCACCCTGTC CCCAGATCCC CTATTCCCTC CACAATAACA GAAACACTCC CAGGGACTCT 2760 
GGGGAGAGGC TGAGGACAAA TAOCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820 
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence: 
Protein Accession #: NP_005619.1 

1 11 21 31 41 51 
I I I I I I 

MVADPPRDSK GLAAAEPTAN GGLALASIED QGAAAGGYCG SRDQVRRCLR ANLLVIXTW 60 
AWAGVALGL GVSGAGGALA LGPERLS AFV FPGELLLRLL RMHLPLWC SUGGAASLD 120 
PGALGRLGAW ALLFFLVTTL LASALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180 
LDSFLDLARN IFPSNLVS AA ERSYSTTYEE RNITGTR VKV PVGQEVEGMN ILGLWFAIV 240 
PGVALRKLGP EGELURFFN SFNEATMVLV SWIMWYAFVG IMFLVAGKTV EMEDVGLLFA 300 
RLGKYILCCL LGHAIHGLLV LPLIYFLFTR KNPYRFLWGI VTPLATAFGT SSSSATLPLM 360
MKCVEENNGV AKHISRFILP IGATVNMDGA ALFQCVAAVF IAQLSQQSLD FVKIITILVT 420
ATASSVGAAG IPAGGVLTLA IILEAVNLPV DHISLILAVD WLVDRSCTVL NVEGDALGAG 480
LLQNYVDRTE SRSTEPELIQ VKSELPLDPL PVPTEEGNPL LKHYRGPAGD ATVASEKESV 540
M 



SEQ ID NO:119 PFJ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_006708

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I

CTAGTTAAGG CGGCACAGGG CCGAGGCGTA GTGTGGGTGA CTCCTCCGTT CCTTGGGTCC 60 

CGTCGTCTGT GATACTGCAG TTCAGCX TATG GCAGAACCGC AGCCCCCGTC CGGCGGCCTC 120

ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180

TTGCAGCAGA CCATGCTACG AGTGAAGGAT CCTAAGAAGT CACTGGATTT TTATACTAGA 240 

GTTCTTGGAA TGACGCTAAT CCAAAAATGT GATTTTCCCA TTATGAAGTT TTCACTCTAC 300

TTCTTGGCTT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360 

GCGCTCTCCA GAAAAGCTAC ACTTGAGCTG ACACACAATT GGGGCACTGA AGATGATGCG 420 

ACCCAGAGTT ACCACAATGG CAATTCAGAC CCTCGAGGAT TCGGTCATAT TGGAATTGCT 480

GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTGAAGAAC TGGGAGTCAA ATTTGTGAAG 540

AAACCTGATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAGATCCTGA TGGCTACTGG 600 

ATTGAAATTT TGAATCCTAA CAAAATGGCA ACCTTAATGT AGTGCTGTGA GAATTCTCCT 660

TTGAGATTTC AGAAGAAAGG AAACAATGTG ATTCAAGATA TTTACATACC AGAAGCATCT 720

AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTCC CCTTCCTATT 780 

TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTTATCT 840

CATGTCCTTG AATATAGTTG TGTAACTTTA TTTTTTAGGT AATAATTAGA ACAGTTCCCT 900 

TCAGAGGCTG CATTTGCCTT CTTCTGCCAC CTAAATATTA CTTCCCTTCA AATCTGCCTT 960

TGAATCATCA TTTTTAAAAA AAAATTAACA TOTTTTTCTl GTAGTTATCT TCTGGGGTTT 1020

CAATTCCTCA GAAACAACTT TTTTCACAAC GGAAAGGAAA GAACACTAGT GTTCTTTCAG 1080

TAAAGTACAA AGTGTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140
CCTGACAAGG ATACTGATAG AAAAAGTGAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200
CAAGGACTAA CCTTATTTAT TTGGGAAAGG GGAGGAGGAA GGAAATGATA TGGTACCCAG 1260
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAGA 1320
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380 
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTGCC TCAAAGTTGT 1440
AGTACCTCAC AATACCTCCA CTGGTTTCCT GTTGTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
TGTGCTCTTG GCTCTTGGGC TGGAGTACCG TGGTGAGGGA GTAAACACTA GAAGTCTTTA 1560 
GTACAAAACT GCTCTAGGGA CACCTGGTGA TTCCTACACA AGTGATGTTT ATATTTCTCA 1620
TAAAGAGTCT TCCCTATCCC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680
GTTCAGTGAT AACTTAGTTA TCAGAAATCA GCTCAGTGGT CTTCCCCGCC ATGATTCACA 1740
TTTGATGAGT TTTTAAAAAT CAAAGTGATT TTGAAAATCT CTAATGGCTC AGAAAATAAA 1800
AACATCCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAGACCTTTG 1860
GAAAGGCCAT GCCAACCGTG CTTGTACTGC TAGAAGCACT TTATGTTTCC TTTTTGGGTG 1920
AAATGGATTT ATGTGAGTGC TTTAAACAAA TAGCAATACT TATAGACTGA AATAAAATGA 1980
AACTTCAAAT AAG 



SEQ ID NO:120 PFJ3 Protein sequence:
Protein Accession #: NP_006699.1

1 11 21 31 41 51 
I I I I I I 

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTLIQK 60
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQSYHNGNS 120
DPRGFGHIGI AVPDVYSACK RFEELGVKFV KKPDDGKMKG LAFIQDPDGY WIEILNPNKM 180
ATLM 



SEQ ID NO:121 PFJ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002867 

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CCGACGCCAG GTCCTGCCGT CCCGCCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60
GAGTCCGCGA TGGCTTCAGT GACAGATGGT AAACATGGAG TCAAAGATGC CTCTGACCAG 120
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTCG CAAGACCTCC 180 
TTCCTCTTGC GCTATGCTGA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGCGGG TGAAACTGCA GATCTGGGAC 300
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360
TTCATTCTGA TGTATGACAT CACCAATGAA GAGTCCTTCA ATGCTGTCCA AGACTGGGCT 420
ACTCAGATCA AGACCTACTC CTGGGACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480
GACATGGAGG AAGAGAGGGT TGTTCCCACT GAGAAGGGCC AGCTCCTTGC AGAGCAGCTT 540 
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600 
CGCCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660
CTGGGCTCCT CCAAGAACAC GCGTCTCTCG GACACCCCAC CX5CTGCTGCA GCAGAACTGC 720
TCATGCTA^C AAGGCCCAOC TTCXnXjACCTCCCCTCATTG TGGCCCCACA CCCAAGTCTG 780 
CTTCTCCCrG tt ACACACTG TCCGCTCT 



SEQ ID NO:122 PFJ2 Protein sequence:
Protein Accession #: NP_002858.1

1 11 21 31 41 51 
I I I I I I 

MASVTDGKHG VKDASDQNFD YMFKLLIIGN SSVGKTSFLL RYADDTFTPA FVSTVGIDFK 60
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRGAMGFIL MYDITNEESF NAVQDWATQI 120
KTYSWDNAQV ILVGNKCDME EERVVPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC



SEQ ID NO:123 PFJ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTCC 60
TGCATGAGGG CGCGGTAGAG ACCCGGACCC GCGCCGTGCT CCTGCCGTTT CX3CTGCGCTC 120 
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCX2ATQ ATTCGCCTCG GGGCTCCCCA 180 
GTCGCTGGTG CTGCTGACGC TGCTCGTOGC CGCTGTCCTT CXK3TGTCAGG GCCAGGATGT 240 
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATGATAAGG ATGTGTGGAA 300
GCCGGAGCCC TGCCGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360
CTGTGAAGAC GTGAAAGACT GCCTCAGCCC TGAGATCCCC TTCGGAGAGT GCTGCCCCAT 420
CTGCCCAACT GACCTCGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480 
ACCTGGAGAC ATCAAGGATA TTGTAGGACC CAAAGGACCT CCTGGGCCTC AGGGACCTGC 540
AGGGGAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGGACC 600
TCGTGGCAGA GATGGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660 
CCCCCCTGGT CCCCCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840
TGAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCCCC 900 
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAACCTGGA AAAGCTGGTG AAAGGGGTCC 960 
GCCTGGTCCT CAGGGTGCTC GTGGTTTCCC AGGAACCCCA GGCCTTCCTG GTGTCAAAGG 1020 
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080
GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGCCT 1140
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTGAAGCCGG CCCCACTGGT GCCCGTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGCCTGCTG GTGCCTCCGG 1380
TAACCCTGGA ACAGATGGAA TTCCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
CCCCAAGGGA GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTGAAGA 1620 
AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680
AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT GGTGAAGATG GTCGTCCTGG 1920 
ACCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980
AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT GAACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGCCTCG TGGGTCCCAG GGGTGAACGA GGTTTCCCAG GTGAACGTGG 2280
CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCCCCCTGGC GCACAGGGCC CTCCAGGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460 
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580
TGCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700
GGGTGAGCAA GGAGAGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGGACCCCC 2880
AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT GACGGTCCCT CTGGTGCCGA 3060
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120 
ACGTGGTGAG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTGAGCCCG GCAAGCAGGG 3180
TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420
AGGAGAAGCT GGTGCACAAG GCCCCATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGGAAT 3480
CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600
TGGAGACCAA GGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660 
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGGA CGATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780
TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840
GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900
GAGACAGCAT GACGCCGAGG TGGATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080
GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140
AGCAAACGTT CCCAAGAAGA ACTGGTGGAG CAGCAAGAGC AAGGAGAAGA AACACATCTG 4200
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320
CATCACCTAC CACTGCAAGA ACAGCATTGC CTATCTGGAC GAAGCAGCTG GCAACCTCAA 4380
GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAGAC 4500
TGTTATCGAG TACCGGTCAC AGAAGACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560
GGACATAGGA GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGTA 4620
AAAACCTGAA CCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680
AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTGACC TGACCTGATG TCCATTCATC 4740
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CTCTCTTTCT AAGAGACCTG AACTGGGCAG 4800
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860
CAAGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920
GCCCAGGCCA GAAGACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980
ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATTTTTTAAA ACATCAATTG ATATTAAAAA 5040
TGAAAAGATT ATTGGAAAGT



SEQ ID NO:124 PFJ1 Protein sequence:
Protein Accession #: NP_001835.2

1 11 21 31 41 51
I I I I I I

MIRLGAPQSL VLLTLLVAAV LRCQGQDVQE AGSCVQDGQR YNDKDVWKPE PCRICVCDTG 60
TVLCDDIICE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120
PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA 180 
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 240
PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300
EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKG 420
SAGAPGIAGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600 
VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 780 
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG 840
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG 1080 
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200 
PPGNPGPPGP PGPPGPGIDM SAFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PIIDIAPMDI GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

GCTGGTCGGA GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60 
GCGTTGGTGC GCGGTGGAAC GCGCCCAGGG ACCCCAGTTC CCGCGAGCAG CTCCGCGCCG 120 
CGCCTGAGAG ACTAAGCTGA AACTGCTGCT CAGCTCCCAA GATGGTGCCA CCCAAATTGC 180 
ATGTGCTTTT CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTGAC TGGCAATACA 240 
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGGAAA TGGGCCTTAT TCCGTTGGTT 360
GTACAGACTT AATGTTTGAT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATCCAT 420
CCCAAGATAA TGATCGCCTT GACACCCTTT GGATCCCAAA TAAAGAATAT TTTTGGGGTC 480
TTAGCAAATT TCTTGGAACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540
CAATGACAAC TCCTGCAAAC TGGAATTCCC CTCTGAGGCC TGGTGAAAAA TATCCACTTG 600
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATTGACC 660 
TGGCATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720 
CTTACTATTT CAAGGACCAA TCTGCTGCAG AAATAGGGGA CAAGTCTTGG CTCTACCTTA 780
GAACCCTGAA ACAAGAGGAG GAGACACATA TACGAAATGA GCAGGTACGG CAAAGAGCAA 840
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
TAGCAGTAAT TGGACATTCT TTTGGTGGAG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020
AGAGATTCAG ATGTGGTATT GCCCTGGATG CATGGATGTT TCCACTGGGT GATGAAGTAT 1080
ATTCCAGAAT TCCTCAGCCC CTCTTTTTTA TCAACTCTGA ATATTTCCAA TATCCTGCTA 1140
ATATCATAAA AATGAAAAAA TGCTACTCAC CTGATAAAGA AAGAAAGATG ATTACAATCA 1200 
GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260
ACATGCTCAA ATTAAAGGGA GACATAGATT CAAATGTAGC TATTGATCTT AGCAACAAAG 1320
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGGAGATGAT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGGAA TAGAGAAATA CAATT^gGAT TAAAATAGGT 1500
TTTTT 



SEQ ID NO:126 PFH9 Protein sequence:
Protein Accession #: NP_005075.1

1 11 21 31 41 51 

MVPPKLHVLF CLCGCLAVVY PFDWQYINPV AHMKSSAWVN KIQVLMAAAS FGQTKIPRGN 60
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120 
LRLLFGSMTT PANWNSPLRP GEKYPLVVFS HGLGAFRTLY SAIGIDLASH GFIVAAVEHR 180
DRSASATYYF KDQSAAEIGD KSWLYLRTLK QEEETHIRNE QVRQRAKECS QALSLILDID 240
HGKPVKNALD LKFDMEQLKD SIDREKIAVI GHSFGGATVI QTLSEDQRFR CGIALDAWMF 300 
PLGDEVYSRI PQPLFFINSE YFQYPANIIK MKKCYSPDKE RKMITIRGSV HQNFADFTFA 360
TGKIIGHMLK LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCLIE GDDENLIPGT 420
NINTTNQHIM LQNSSGIEKY N



SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015300

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CACGAGCGGC ACGAGGATTT CCAGCTCAGC GATGCCCCCA GGTCCCTGGG AGAGCTGCTT 60 
CTGGGTGGGG GGCCTCATTT TGTGGCTCAG CGTTGGAAGT TCAGGGGATG CACCTCCTAC 120 
CCCACAGCCA AAGTGCGCTG ACTTCCAGAG CGCCAACCTT TTTGAAGGCA CCGATCTCAA 180
AGTCCAGTTT CTCCTCTTTG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAGAAGGAAG 240 
CAGTGACCTC CAAAACTCTG GGTTCAATGC CACTCTGGGA ACCAAACTAA TTATCCATGG 300
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CXXTTCTGCG 360
TGCAACGAAT GCTAATGTGA TTGCCGTGGA CTGGATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTGA TTAAGTTGAG CCTCGAGATC TCCCTTTTCC TCAATAAACT 480
CCTGGTGCTG GGTGTGTCGG AATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCCCA 540 
CGTTGGGGGC ATGGTGGGAC AGCTCTTCGG AGGCCAGCTG GGACAGATCA CAGGCCTGGA 600
CCCCGCTGGA CCTGAGTACA CCAGGGCCAG TGTGGAAGAG CGCTTGGATG CTGGAGATGC 660 
CCTCTTCGTG GAAGCCATCC ACACAGACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGGAC TACTTCGTCA ACGGAGGCCA AGACCAACCT GGCTGCCCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTGATCA CATGAGGGCT GTGCACCTCT ACATCAGCGC 840
CCTGGAGAAT TCCTGTCCAC TGATGGCCTT TCCCTGTGCC AGCTACAAGG CCTTCCTTCC 900 
TGGACGCTGT CTGGATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960
GGAACAAGGT GGTGTCAAGA TAGAGCCGCT CCCCAAGGAA GTGAAAGTCT ACCTCCTGAC 1020 
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TGAAGGAACT 1080 
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCTCTTCATC 1140
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGGA ATCATAGCCC ATGCCACCCC 1200
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGACCGGACT ACCATTATTG GGAAGTTCTG CACTGCCCTT TTGCCTGTCA ATGACAGAGA 1320
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380
CCTGAAGATA GCCTGTGTGT AGTTTAACCT GGGCAGGACA CATCTCCCTG CATTTTTTTT 1440
TTTTTTTTTT GAGAGAGAGG TGTGATGAGG GATGTGTGTG TGCAGCTTAT TGTAGACCAT 1500 
TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560 
GGGAGGGAGA ACTCATTTTA CAGAACTTGG TTTCCTTTGC CX3ATCTTATG TACATACCCA 1620 
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTCCTTGGGC ATTCGTACTT 1680 
AGGATTCAAT AGAAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA



SEQ ID NO:128 PFH8 Protein sequence:
Protein Accession #: NP_056984.1

1 11 21 31 41 51
I I I I I I 

MPPGPWESCF WVGGLILWLS VGSSGDAPPT PQPKCADFQS ANLFEGTDLK VQFLLFVPSN 60
PSCGQLVEGS SDLQNSGFNA TLGTKLIIHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120
WIYGSTGVYF SAVKNVIKLS LEISLFLNKL LVLGVSESSI HIIGVSLGAH VGGMVGQLFG 180 
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGIRIPVG HVDYFVNGGQ 240
DQPGCPTFFY AGYSYLICDH MRAVHLYISA LENSCPLMAF PCASYKAFLA GRCLDCFNPF 300
LLSCPRIGLV EQGGVKIEPL PKEVKVYLLT TSSAPYCMHH SLVEFHLKEL RNKDTNIEVT 360
FLSSNITSSS KITIPKQQRY GKGIIAHATP QCQINQVKFK FQSSNRVWKK DRTTIIGKFC 420
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV



SEQ ID NO:129 PFH7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014384

Coding sequence: 69-1338 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCGAAGG CGTTCAGACT 60 
CTTAGCTGAA CGCGGAGCTG CGGCGGCTAT GCTGTGGAGC GGCTGCCGGC GTTTCGGGGC 120
GCGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180
GACCTCCTGC ATCGACCCTT CCATGGGACT TAATGAAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC GAGAGATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300
GCTGTTCCCA GTGGATGTGA TGCGGAAGGC AGCCCAGCTA GGCTTCGGAG GGGTCTACAT 360 
ACAAACAGAT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATCCACAACA TGTGTGCCTG 480
GATGATTGAT AGCTTCGGAA ATGAGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATGGAGAAG TTTGCTTCCT ACTGCCTCAC TGAACCAGGA AGTGGGAGTG ATGCTGCCTC 600
TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGGTGAGT CAGACATCTA TGTGGTCATG TGCCGAACAG GAGGACCAGG 720
CCCCAAGGGC ATCTCATGCA TAGTTGTTGA GAAGGGGACC CCTGGCCTCA GCTTTGGCAA 780
GAAGGAGAAA AAGGTGGGGT GGAACTCCCA GCCAACACGA GCTGTGATCT TCGAAGACTG 840
TGCTGTCCCT GTGGCCAACA GAATTGGGAG CGAGGGGCAG GGCTTCCTCA TTGCCGTGAG 900
AGGACTGAAC GGAGGGAGGA TCAATATTGC TTCCTGCTCC CTGGGGGCTG CCCACGCCTC 960
TGTCATCCTC ACCCGAGACC ACCTCAATGT CCGGAAGCAG TTTGGAGAGC CTCTGGCCAG 1020
TAACCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGCGGCT 1080 
GATGGTCCGC AATGCAGCAG TGGCTCTGCA GGAGGAGAGG AAGGATGCAG TGGCCTTGTG 1140
CTCCATGGCC AAGCTCTTTG CTACAGATGA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200
GATGCACGGG GGCTACGGCT ACCTGAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAGGGTCCAC CAGATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTGA TCTCTAGAAG 1320
CCTGCTTCAG GAGTAGAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTGAGCTCC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT CGGGTCTTGG 1500
ACTGGGGCAG AATCCCCAGT GGAACCGGAA GAGCTGGACT GATGAGAAAC ATCAGAAGAA 1560
CACATACTAC CTTGTTTTCC TAATGCCAGA AGGGTGACCA GTGAAGATTC ACCGTCAAAC 1620
CATGAAAGTC CTTTCTTGGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1680
GGATCCCTCC TCTAGGGGCC TGGGGACTTT CACTGATGCT CTTCCTGATT CTAGAGCAAA 1740
GGTGTGGGAA GGGGAAATGG AGGAATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800
TACAGATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT GATAAAATGG 1860 
ATATTTGGAA ACTTACTCCT AAGCTGTGAT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAGACT TTTGAATGTT GAATATTCGT TGGGTTTCAT GTTAAGACGC 1980
CTGTGGTCCA GGAGTGCTAT TCAGTGTTTC TGTTCCTGAT AAACACTTTG AATATTTTTT 2040
TGTGTTTTTG TTTCCTTTTC TGAAGCTGTT CCTCCTTTTA AATATTTTTA ATCACATTGA 2100 
TAAAATCTAT CCTTCATCCA CCTCTGGTTC TACTATAGTT GATTTTTATT TTAAATGTTT 2160 
AATTGTATTT GATTAAACAC TTAACTGGAT TTTGGAATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTTAAAA AAAAAAAA 



SEQ ID NO:130 PFH7 Protein sequence: 
Protein Accession #: NP_055199.1



1 11 21 31 41 51 
I I I I I I 

MLWSGCRRFG ARLGCLPGGL RVLVQTGHRS LTSCIDPSMG LNEEQKEFQK VAFDFAAREM 60
APNMAEWDQK ELFPVDVMRK AAQLGFGGVY IQTDVGGSGL SRLDTSVIFE ALATGCTSTT 120 
AYISIHNMCA WMIDSFGNEE QRHKFCPPLC TMEKFASYCL TEPGSGSDAA SLLTSAKKQG 180
DHYILNGSKA FISGAGESDI YVVMCRTGGP GPKGISCIVV EKGTPGLSFG KKEKKVGWNS 240
QPTRAVIFED CAVPVANRIG SEGQGFLIAV RGLNGGRINI ASCSLGAAHA SVILTRDHLN 300
VRKQFGEPLA SNQYLQFTLA DMATRLVAAR LMVRNAAVAL QEERKDAVAL CSMAKLFATD 360
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SRVHQILEGS NEVMRILISR SLLQE



SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAGAGAGT 60
GAGAAAAAAG AGGAGTCAGT CGCTCCTGGG GAAGGGAGAG AGTGAGACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCTTAAA 180 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAGAG 240
CATAGAGACA ATGAAAGGCT AAAGAAAATT TTAAAATCTC TGCCACAGTC TCATAGGTGC 300
TTGGAAATGA AAGTAGAACT GCCTGTCTTT AACGGACTCT GACAGAGGTA ACTGGATTAG 360
GGACGAGTAC GGCAGCTTTT TTTTTTTTTT TTTTTTTTTT TTTAACATCT TAAATCCTGA 420
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TGAATGAATT GATGGGCACA 480
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCTT CTTTGAAAGA 540 
GGAGACAACT TGGGCTTCCT TTTAATTTAG TTTTTTTTCC CCTTCTCCCC CAACCXCCAA 600
CCTTCCCCCT TACCTCCCCC ACCCCCnTA TCACCACCCC CCTTTTAAAT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGATGG GCATCCTCAG 720 
CGTAGACTTG CTGATCACAC TGCAAATTCT GCCAGTTTTT TTCTCCAACT GCCTCTTCCT 780
GGCTCTCTAT GACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTGA GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAGAG GGACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTCGATGCCT ACAAACAGGT GAAATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAGAAGGAGG TGACAACAGT GGCAATGGTA CCCAGG AGAA 1020 
GATAGCTGAG GGAGCCACAT GCCACCTTCT TGACTTTGCC AGCCCTGAGC GCCCACTAGT 1080 
GGTCAACTTT GGCTCAGCCA CTTGACCTCC TTTCACGAGC CAGCTGCCAG CCTTCCGCAA 1140
ACTGGTGGAA GAGTTCTCCT CAGTGGCTGA CTTCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCGA TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAGAAGCA 1260 
CCAGAACCAG GAAGATCGAT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320
GCCCCAGTGC GGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380
AGCXTTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440
CCCCTTCTCC TACAACCTTC AAGAAGTCCG GCATTGGCTG GAGAAGAATT TCAGCAAGAG 1500
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAGAGCTT ATTGTTTTAA 1560
AAAGTTATAT AAAGGCAAGG AAATTAAGAA CTGAATCCAT ATTTCAACAG AGCCCTATTG 1620 
GCTTACTGAA AGACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680 
TTCTTTCACT ACTCAAATGG CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAGA TGGAGAGGAA CAAACGCTAA TTCAGCATGT 1800
GTTCATTCTG CATTGAGAAG GAACTGATAC ATCTGATGCA TGCTTTGAGA CCAGAAGAAA 1860 
AGACTTACCT GAATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920
TTGCCTTGGC TCTATTTGGC ATGGATGGAG CCCAGTTGGA AAATTCCCAA ATATTACAAC 1980
AAGTCCTTGA ACCCAGGCCA TGTGGTTAGA CGTTGGTGTT AAGGTTAGAC CTTATGTTAG 2040
AGTCATTTCT GATGTTCCAG CTTCTAGCCA TGTAGTGCTC TCAGTCTTCA TACCCCAGAA 2100
ATTATTGGTA TATTTGTAGA TACCGAGAAT GATCCCTCAG TCTGAGAGGT TAGAATGATC 2160
ATCTGTAATC TGAGGGTTAA TTTCTAGGCA GGTGGAGAGA GTGGTAAAAA AGAAATGAAA 2220 
TTGACAAGCT AGGAAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAGAAG AAGGAGCTCA ACTAAAAGTG GCATAGAGAA TACACATATA AAAACAATAT 2400
ATTTGTCATA TGCTCCTAGA GAGGAGAAAG GGGTGATTGA AAGAAAAAAA AATACTTAAA 2460
TATTTGTAAT TGTGAGGGGT TTCTTTTGGA AATAATTACT TTTGAACCAT GTATGTGGTA 2520
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCCC AAGTCATTCC 2640
TAATCTTGGT CTCTATAGCA GTGTTCTCTC TGAATGCTGA GCTGAAGAAA TTATACGTAC 2700
ATACACACAT ACATACATAC ATACAAATAT ATGTATATAT ATTCTCAGCT GCTGCGGGAG 2760
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT GAGCTATAGT 2820
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAGAAGAA GAGGAAGTTA 2880
GAGATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAACCCCC GGTATATCAT 2940
GGAATTTCCA TTGACATTTG AATTTGGACT TGGATCTTCC CTTGGTCCCA TTAGCTGAGG 3000
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 
TAAAATATTT TTTTCTTTTT AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120
GATTATAGCT CCCAAAAGAA TGGACCAACC ACTTTCGTAT CATAATTTCT TTTTGGTAAA 3180 
TATGAGACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAGAA AAAAACTAAA 3300 
GTTGAAAATA CATTCTTAAA CTAGTTGTCT GAAATGAGAA AAGAGTGAGA ACTAGGTGTG 3360
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGGAC TGATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCAGTTGTT 3600
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660
AGATTTACTG TGGAACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720
TGTTGTCCTT AAAATTCCCC TTTTTTCTCT ATGTACGATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAGAAGGA GGTCGTTTAT GTGTGCAGAC 3900 
AATTCTCCCT GAGGTTAGCC CAATGGAGAA ATGAAGCAGA GGAAGGAAAC ATAGAAAGAC 3960
ATGGGCTATC AGGGAGGAAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020
GTGGAAGGGC CAATGGAGAA AATGAATGGA CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
ATGTTCTTGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TCGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAGAA ATTATTAGAT TGCCAATACT 4260
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTCCTTTGA AGAATTGTAG TTCTTAGTCC 4320
CACAGGGAAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATGATAT ATTTATATCA 4380
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440
TTACATGGAG CTATCGGTTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500
TTTCCAAGAA ATTTTAGATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560
AAAGACTTAT GTCTTGGACC TATCAAAAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620
AGTGGGATCA ACAATGATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680 
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAGTAACT 4740 
GTTCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800 
TTGGGTCTCT GK3TCCTGTGT CTTCACCTCA TTTATAGCAC GTCTCCTTGA TTTTTGGTAG 4860 
TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980 
CTATTTGGGC TCTGAAATAA AAATTATGAA ATATGGTGAG GTCACATGTT GGTGCTGCCT 5040
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 
AGCAAGAAGA ATTGACTGAT TTACAGGACT TCTCTTTATG TCAATCTTAA GAGGATGGAT 5220
GAATCTGGAC ATTTGTTCCA CCCGACCTCT GACTGATGGT TTGGAAAATA ACTTTAATTA 5280
GGATCATATG ACCATTGAAA AAGGAAAAAT GTAGACTCTG ACTTCCGTCC CACTGAAGGA 5340 
TTAATGAAAA CCTTTACTAG CATTTAGAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG GTTTTTTTTT TTTTTTTTTT 5460
TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCGAACTT 5520 
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAGAG AGGATCTAGG ATGGGAGAGC TAGAAAGTTG CTAACTGGGA AGAACAAGGC 5700
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTGAAGG 5760
AAGGCTGAAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820
TTCCTAATGC CCTATGTGTA TAGTACCAGA AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCAGGGAG 6000
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060 
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGAAGGG GATGGGAAAT AAAGAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTTTTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240
GTATCCAGTA CTTTATAACC AAAGCAATTA AATGATATTG GGGTAGGGAA TGTTGGCCAG 6300
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360 
CGCCCCGAAG AGGGAGACAG AGATGTGCCA GAGTTGACCC AGTGTGCGGA TGATAACTAC 6420
TGACGAAAGA GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480
CCCATCTTTG TCTCCCTGGC AAGGAGAATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600
TGAACTGCTT TGGGAGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAGA ATAAAAAAAA 6720
AAAAAAAAAA AAAAA



SEQ ID NO:132 PFH6 Protein sequence:
Protein Accession #: NP_054644.1

1 11 21 31 41 51 
I I I I I I 

MGILSVDLLI TLQILPVFFS NCLFLALYDS VILLKHVVLL LSRSKSTRGE WRRMLTSEGL 60
RCVWKSFLLD AYKQVKLGED APNSSVVHVS STEGGDNSGN GTQEKIAEGA TCHLLDFASP 120
ERPLVVNFGS ATXPPFTSQL PAFRKLVEEF SSVADFLLVY IDEAHPSDGW AIPGDSSLSF 180
EVKKHQNQED RCAAAQQLLE RFSLPPQCRV VADRMDNNAN IAYGVAFERV CIVQRQKIAY 240
LGGKGPFSYN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

CAGGCGTGTC CCAGGGGGAG OXCGCTCTG CAGCCCTGTG CGCCGTAGAG AGCTGGACTT 60 
AGGCTGGCAG CATGGCCGAG TTCAGGGTCA GGGTGTCCAC CGGAGAAGGC TTCGGGGCTG 120
GCACATGGGA CAAAGTGTCT GTCAGCATCG TGGGGACCCG GGGAGAGAGC CCGCCACTGC 180
CCCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GGAGGACTTC CAGGTGACGC 240
TCCCGGAGGA CGTAGGCCGA GTGCTGCTGC TGCGCGTGCA CAAGGCGCCC CCAGTGCTGC 300
CCCTGCTGGG GCCCCTGGCC CCGGATGCCT GGTTCTGCCG CTGGTTCCAG CTGACACCGC 360
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGGA GGAGCTTCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGGAT GAAAAGACAG TGGAAGACTT GGAGCTCAAT ATCAAATACT 600 
CCACAGCCAA G AATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAG ATX5 AAAA 660 
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GGAGG AGTCT GAATGAGATG AAAAGGATCT 720 
TCAACTTOCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTCGCCTC CCAGITCCTG AATGGTCTC A ACCCTGTCCT G ATCCGCCGC TGTCACTACC 840 
TCCCAAAGAA CTTCCCCGTC ACTGATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900 
GCTTGCAGGC TGAGCTAGAG AAGGGCTCCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960 
GCATCCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGACCCTGC 1020 
TATACCAGAG CCCAGGCTGC GGGCCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTG ATGACAA GTGGGACTGG TTGCTGGCCA 1140 
AGACCTGGGT GCGCAATGCC GAGTTCTCCT TCCATGAGGC CGTCACGCAC CTGCTGCACT 1200 
CACATCTGCT GCCTGAGGTC TTCA(XCTGG CTACCCTGCG TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTCAA GCTGCTG ATC CCGCACACCC GATACACCCT GCACATCAAC ACACTCGCCC 1320 
GGG AGCTGCT TATCGTGCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCTTCTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTX3TGTCTGC 1440 
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTG ATGATG 1500 
GGATGCAGAT TTGGGGTGCA GTGG AACGCT TTGTCTCTGA AATCATCGGT ATCTACTACC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAACCAG GAGAGCTCAG GTATCCCTTC CTCACTGGAG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTGATAT TCACCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTG ACTCC TGTGCTTGG A TGCOCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTG 1860 
TCAATGCCAC ATGTGATGTC ATCCTTGCTC TCTGGTTGCT GAGCAAGGAG CCTGGAGACC 1920 
AAAGGCCCCT GGGCACCTAT CCGGATGAGC ACTTCACAGA GGAGGCCCCT CGGCGGAGCA 1980 
TCGCCACCTT CCAGAGCCGC CTGGGCCAGA TCTCGAGGGG CATCCAGGAG CGGAACCGGG 2040 
GCCTGGTGCT GCCCTACACC TACCTAG ACC CTCCCCTCAT CGAG AACAGC GTCTCC ATCJ 2100 
MATOXAGG GG AACACAGG CCCAGATGAC ATCCCTTTGA CCACATCGCT CTAGGATAAC 2160 
TGGCACCCAG AG AAAAGGAC TCCTCAGAAA AAACAGGCGC CCATGTGCCT CTCCTGGG AC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGCCCACC TTGAGGGTTT TGCTAGTTGG 2400 
111 1G 1 1 110 CGTTTACAGC CGTGGGGCGA AGCACATAAT CCCGCCCCAG GGCCCACTAG 2460 
CATCCACTG A TTGGACCTTA TGGTCACCCA ACTCAAGGAC AGCCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGCCC ATAATCCCAG CACTTTGGGA GATGGAGGCG 2580 
GGAAAATCAT TTGAGGTCAG AAGTTCAAGG CCAGCCTGGA CGACATAGCG AG ACTCCACC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence:

Protein Accession #: NP_001132.1

1 11 21 31 41 51 
I I I I I I 

MAEFRVRVST GEAFGAGTWD KVSVSIVGTR GESPPLPLDN LGKEFTAGAE EDPQVTLPED 60 
VGRV1XLRVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HLLFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNIKYSTAK 180 
NANFYLQAGS AFAEMKIKGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240 
QFLNGLNPVL IRRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGILSGIQT 300 
NVINGKPQFS AAPMTLLYQS PGCGPLLPLA IQLSQTPGPN SPIFLPTDDK WDWLLAKTWV 360 
RNAEFSFHEA LTHLLHSHLL PEVFTLATLR QLPHCHPLFK LUPHTRYTL HINTLARELL 420 
IVPGQVVDRS TGIGIEGFSE UQRNMKQLN YSLLCLPEDI RTRGVEDIPG YYYRDDGMQI 480 
WGAVERFVSE IIGIYYPSDE SVQDDRELQA WVREIFSKGF LNQESSGIPS SLETREALVQ 540 
YVTMVIFTCS AKHAA VS AGQ FDSCAWMPNL PPSMQLPPPT SKGLATCEGF IATLPPVNAT 600 
CDVILALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQISRG IQERNRGLVL 660 
PYTYLDPPU ENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002742

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

GAATTCCTTC TCTCCTCCTC CTCGCCCTTC TCCTCGCCCT CCTCCTCCTC CTCGCCCTCC 60 
CCTCCCG ATC CTCATCCCCT TGCCCTCCCC CAGCCCAGGG ACTTTTCCGG AAAGTTTTTA 120 
TTTTCCGTCT GGGCTCTOGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180 
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCCCGGCCCT GCGCCCCGCC G AGCGATQAG 240 

cxxxxxttccg gtcctgcggc cgcccagtcc gctgctgccc GTGGCGGCGG CAGCTGCCGC 300 

AGCGGCCGCC GCACTGGTCC CAGGGTCCGG GCCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 
CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAG ATC GGCCTGAGCC GTG AGCCGGT 420 
GCTGCTGCTG CAGGACTCGT CCGGGGACTA CAGCCTGGCG CACGTCCGCG AGATGGCTTG 480 
CTCCATTGTC G ACCAG AAGT TCCCTG AATG TGGTTTCTAC GG AATGTATG ATAAGATCCT 540 
GCTTTTTCGC CATGACCCTA CCTCTGAAAA CATCCTTCAG CTGGTGAAAG CGGCCAGTGA 600 
TATCCAGGAA GGCGATCTTA TTGAAGTGGT CTTGTCACGT TCCGCCACCT TTGAAGACTT 660 
TCAGATTCGT OCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720 
CTGTGGAGAA ATGCTGTGGG GGCTGGTACG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAAAT ACCCAACAAT TGCAGCGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACCATC CGCACATCAT CTGCTGAACT 900 
CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAGAGT CGTTTATTGG 960 
TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTGACAAGAT 1020 
TTTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA OCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140 
AGATTGCAG A TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
CGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 
AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320 
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAGACC ACGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACGAAGAGGA AAAGCAGCAC 1500 
AGTCATGAAA GAAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAACGGCA 1560 
CTATTGGAGA TTGGATAGCA AATGTATTAC CCTCTTTCAG AATGACACAG GAAGCAGGTA 1620 
CTACAAGGAA ATTOCTTTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CTTCAGCTTT 1680 
AATTCCTAAT GGGGCCAATC CTCATTGTTT CGAAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTG TTCTCACCAG 1800 
TGGCGTTGGT GCAGATGTGG CCAGGATGTG GGAGATAGCC ATCCAGCATG CCCTTATGCC 1860 
CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACCAAC TTGCACAGAG ATATCTCTGT 1920 
GAGTATTTCA GTATCAAATT GCCAGATTCA AGAAAATGTG G ACATCAGCA CAGTATATCA 1980 
GATTTTTCCT G ATG AAGTAC TGGGTTCTGG ACAGTTTGGA ATTGTTTATG GAGGAAAACA 2040 
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACGAT TTCCAACAAA 2100 
ACAAG AAAGC CAGCTTCGTA ATGAGGTTGC AATTCTACAG AACCTTCATC ACCCTGGTGT 2160 
TGTAAATTTG GAGTGTATGT TTGAGACGCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220 
CCATGGAGAC ATGCTGG AAA TGATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTGATC CTTTTCCTCA 2400 
GGTGAAACTT TGTGATTTTG GTTTTGCCCG G ATCATTGG A G AG AAGTCTT TCCGGAGGTC 2460 
AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520 
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGCCTAAGCG GCACATTCCC 2580 
ATTTAATGAA GATGAAGACA TACACGACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640 
AAATCCCTGG AAGG AAATAT CTCATG AAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700 
AAAAATGAGA AAGCGCTACA GTGTGGATAA GACCTTGAGC CACCCTTGGC TACAGGACTA 2760 
TCAGACCTGG TTAGATTTGC GAG AGCTGG A ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820 
TGAAAGTGAT GACCTGAGGT GGGAGAAGTA TGCAGGCGAG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTG CTAGCCACAG TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT CTGAGTTCCA TCTCCTATAA TCTGTCAAAA 3000 
CACTGTGG AA CTAATAAATA CATACGGTCA GGTTTAACAT TTGOCTTGCA GAACTGCC AT 3060 
TATTTTCTGT CAGATG AGAA CAAAGCTGTT AAACTGTTAG CACTGTTGAT GTATCTGAGT 3120 
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT G AAACACGAA ACTTGTTATT GTGAATGATT CATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGG AGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TG AAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 
TGTAAACAAA CTCTTG AAG A GTCGATTATT T0C AGTGTTC TATG AACAAC TCCAAAACOC 3420 
ATGTGGGAAA AAAATGAATG AGGAGGGTAG GG AATAA AAT CCTAAGACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATOCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAG ACA ATGCACCTAG CTGTGCAAGA CCTAGTGCTC TTAAGCCTAA ATGCCTTAGA 3600 
AATGTAAACT GCCAT ATATA AC AGAT AC AT TTOCCTCTTT CTTATAAT AC TCTGTTGTAC 3660 
TATGGAAAAT CAGCTGCTCA GCAACCTTTC ACCTTTGTGT ATTTTTCAAT AATAAAAAAT 3720 
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence:

Protein Accession #: NP_002733.1



1 11 21 31 41 51
I I I I I I 

MSAPPVLRPP SPLLPVAAAA AAAAAALVPG SGPGPAPFLA PVAAPVGGIS FHLQIGLSRE 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK tLLFRHDPTS ENILQLVKAA 120 
SDIQEGDUE WLSRSATFE DFQIRPHALF VHSYRAPAFC DHCGEMLWGL VRQGLKCEGC 180 
GLNYHKRCAF KIPNNCSGVR RRRLSN VSLT GVSTIRTSSA ELSTSAPDEP LLQKSPSESF 240 
IGKEKRSHSQ SYIGRPIHLD KBMSKVKVP HTFVIHS YTR FTVCQYCKKL LKGLFRQGLQ 300 
CKDCRFNCHK RCAPKVPNNC LGEVTINGDL LSPGAESDW MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDPDPDHE DANRTISPST SNNIPLMRW QSVKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWJLDSKC ITLFQNDTGS RYYKEIPLSE ILSLEPVKTS 480 
AUPNG ANPH CFEITTANVV YYVGENWNP SSPSPNNSVL TSGVGADVAR MWEIAIQHAL 540 
MPVIPKGSS V GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600 
KHRKTGRDVA IKIIDKLRFP TKQESQLRNE VAUjQNLHHP GWNLECMFE TPERVFWME 660 
KLHGDMLEMI LSSEKGRLPE HITKFUTQI LVALRHLHFK NIVHCDUCPE NVLLASADPF 720 
PQVKLCDFGF ARUGEKSFR RSWGTPAYL APEVLRNKGY NRSLDMWS VG VDYVSLSGT 780 
FPFNEDEDIH DQIQNAAFMY PPNPWKEISH EAIDUNNLL QVKMRKRYSV DKTLSHPWLQ 840 
DYQTWLDLRE LECKIGERYI THESDDLRWE KY AGEQRLQY PTHLINPS AS HSDTPETEET 900 
EMKALGERVSIL 



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I

AATGGTC AGT CAATACATTA TAACATAATA CACCAAATGC TAGAATAGAA GGGG AGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAGAAG GGTGCATCAG TGAATTAAAA AATGTCCCTC 120 
GCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA ACTCTGGATC 180 
TTTGCTTTTG CTCCCTGCTC TCCT G 11111 CATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240 
TTATCCTTAG CCACCCTGCT TTTI 1C CTCC 1 1 1 1 1 1A AAA AATCGGAG AT TTCGTCTTAA 300 
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTCCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGAOCC AG ACCCTCTC GACACCCTTG ATCCG AGTCA GATCTGCACT AGCAACCAGA 480 
ACTAATATTT CATTTAACCC ACCAAAAGGG GGAGGCGAGA GG AGCCAG AA GCAAACTTCA 540 
TCTGTCTCAG ACGGATCCGT GGTTCCTACA TTTGGAGGAG CCGCGTGTCA GAAGGCGTAG 600 
GACCOCAAGG GGGGACAAGG AGGACTOCCG AGTCTCCCTT CTOCGCTCTC CGAGACCGAA 660 
GAGGTGGACT GAGCCGCTCG GGACAGCGGC ACCGGAGGAG GCTCGGAGAA GAT£CGGGGC 720 
TOGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
ACCCCAGCGT CCCTGGCCGG CTGCTACTCT GCACCTCGAC CKKjCTCCCCT CTGGACGTGC 840 
CTTCTCCTGT GCGCCGCACT CCGGACOCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900 
TTGGATTCAC GCACTGTCAT GGGGGACCTG GGATGGATTG CTTTTCCAAA AAATGGGTGG 960 
GAAGAGATTG GTGAAGTGGA TG AAAATTAT GCCCCTATCC ACACATACCA AGTATGCAAA 1020 
GTGATGGAAC AGAATCAGAA TAACTGGCTT TTGACCAGTT GGATCTCCAA TGAAGGTGCT 1080 
TCCAGAATCT TCATAG AACT CAAATTTACC CTGCGGGACT GCAACAGCCT TCCTGGAGGA 1140 
CTGGGGACCT GTAAGG AAAC CTTTAATATG TATTACTTTG AGTCAGATGA TCAGAATGGG 1200 
AGAAACATCA AGGAAAAOCA ATACATCAAA ATTGATACCATTGCTGCCGA TGAAAGCTTT 1260 
ACAG AACTTG ATCTTGGTG A CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGG A 1320 
CCTCTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCTGTGG TACGACACTT GGCTGTCTTC 1440 
CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCTGTGTC 1500 
AACCATTCTG TGACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAGAC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATGA GG AAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 
AGGAGAG AGT CTGATCCACC CACAATGGCA TGC ACAAGAC CCCCCTCTGC TCCTCGGAAT 1800 
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTGACACT 1860 
GGTGGAAGG A AAG ACGTGTC ATATTATATT GCATGCAAG A AGTGCAACTC CC ATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTG AAAAAC 1980 
ACCTCTGTCA TG ATGGTGG A TCTACTCGCT CACACAAACT ATACCTTTG A G ATTGAGGCA 2040 
GTGAATGGAG TGTCCGACTT GAGCCCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100 
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAA AT TGC AAAAAAC 2160 
AGCATCTCTT TGTCTTGGCA AG AACCAGAT CGTCCCAATG GAATCATCCT AGAGTATGAA 2220 
ATCAAGCATT TTGAAAAGGA CCAAGAGACC AGCTACACGA TTATCAAATC TAAAGAGACA 2280 
ACTATTACTG CAG AGGGCTT GAAACCAGCT TCAGTTTATG TCTTCCAAAT TCGAGCACGT 2340 
ACAGCAGCAG GCTATGGTGT CTTCAGTCGA AGATTTGAGT TTGAAACCAC CCCAGTGTTT 2400 
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTGAC AGTAGGAGTC 2460 
ATTTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCG A ATGTGGCTGT 2520 
GGG AGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AGAAG AGG AA AAGATGCATT TTCATAATGG GCACATTAAA 2640 
CTGCCAGG AG TAAGAACTTA CATTG ATCCA CATACCTATG AGGATCCCAA TCAAGCTGTC 2700 
CACGAATTTG CCAAGGAGAT AGAAGCATCA TGTATCACCA TTGAGAGAGT TATTGGAGCA 2760 
GGTGAATTTG GTGAAGTTTG TAGTGGACGT TTGAAACTAC CAGGAAAAAG AGAATTACCT 2820 
GTGGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTCCTAGGT 2880 
GAAGCAAGTA TCATGGGACA GTTTGATCAT CCTAACATCA TCCATTTAGA AGGTGTGGTG 2940 
ACCAAAAGTA AACCAGTGAT GATCGTGACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000 
TTTTTG AAG A AAAACGATGG GCAGTTCACT GTGATTCAGC TTGTTGGCAT GCTGAGAGGT 3060 
ATCTCTGCAG G AATG A AGTA CCTTTCTG AC ATGGGCTATG TGCATAG AG A TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTCCCGG 3180 
GTACTGGAAG ATG ATCOCG A GGCAGCCTAC ACCACAAGGG G AGG AAAAAT TCCAATCAG A 3240 
TGGACTGCCC CAGAAGCAAT AGCTTTCCGA AAGTTTACTT CTGCCAGTGA TGTCTGGAGT 3300 
TATGGAATAG TAATGTGGGA AGTTGTGTCT TATGGAGAGA GACCCTACTC GGAGATGACC 3360 
AATCAAGATG TGATTAAAGC GGTAGAGGAA GGCTATCGTC TGCCAAGCCC CATGGATTGT 3420 
CCTGCTGCTC TCTATCAGTT AATGCTGGAT TGCTGGCAGA AAGAGCGAAA TAGCAGGCCC 3480 
AAGTTTGATG AAATAGTCAA CATGTTGG AC AAGCTGATAC GTAACCCAAG TAGTCTG AAG 3540 
ACGCTGGTTA ATGCATCCTG CAGAGTATCT AATTTATTGG CAGAACATAG CCCACTAGGA 3600 
TCTGGGGCCT ACAGATCAGT AGGTGAATGG CTAGAGGCAA TCAAGATGGG CCGGTATACA 3660 
GAGATTTTCA TGGAAAATGG ATACAGTTCA ATGGACGCTG TGGCTCAGGT GACCTTGGAG 3720 
GATTTGAGAC GGCTTGG AGT GACTCTTGTC GGTCACCAGA AGAAGATCAT GAACAGCCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGGA ATGGTGCCAT TGTAACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTATTTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence:
Protein Accession #: CAA64700.1



1 11 21 31 41 51 
I I I I I I 

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CVSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGWIAFPK NGWEHGEVD ENYAPIHTYQ VCKVMEQNQN NWLLTS WISN 120 
BGASRDFIEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNIKENQ YIKIDTIAAD 180 
ESFTELDLGD RVMKLNTEVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPS WRHL 240 
AVFPDTTTGA DSSQLLEVSG SCVNHSVTDE PPKMHCSAEG EWLVPIGKCM CKAGYEEKNG 300 
TCQVCRPGFF KASPHIQSCG KCPPHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360 
PRNAISNVNE TSVFLEWIPP APTGGRKDVS YYIACKKCNS HAGVCEECGG HVRYLPRQSG 420 
LKNTS VMMVD LLAHTNYTFE IEAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480 
AKNSISLSWQ EPDRPNGUL EYELKHFEKD QETSYTUKS KETTtTAEGL KPASVYVFQI 540 
RARTAAGYGV FSRRFEFETT PVFAASSDQS QEPVIAVS VT VGVILLAWI GVLLSGSCCE 600 
CGCGRASSLC AVAHPHJWR CGYSKAKQDP EEEKMHFHNG HUCLPGVRTY IDPHTYEDPN 660 
QAVHEFAKEI EASOTIERV IGAGEPGEVC SGRLKLPGKR ELPVAHCTLK VGYTEKQRRD 720 
FLGEASIMGQ FDHPNIIHLE GWTKSKPVM IVTEYMENGS LDTFLKKNDG QFTVIQLVGM 780 
LRGISAGMKY LSDMGYVHRD LAARNIUNS NLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840 
P1RWTAPEA1 AFRKFTS ASD VWS YGIVMWE WS YGERPYW EMTNQDVDCA VEEG YRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDETVN MLDKURNPS SUCTLVNASC RVSNLLAEHS 960 
PLGSGAYRSV GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020 
NSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTC1 1U11C CCCCCGAGCT 60 
GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 
TGCTOCTGCT CTTGGTGC AG CTGCTGCGCT TCCTG AGGGC TG ACGGCGAC CTGACGCTAC 180 
TATGGGCCGA GTGGCAGGG A CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 
TAG AGAATGG CAATTTAAAA GAAAAAG ATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 
ACAG AAAGCT AATAG AGCTT AACTACTTAG GG ACGGTGTC CTTGACAAAA TGTGTTCTGC 600 
CTCACATGAT CG AG AGG AAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GG1 1 111 I I A 720 
ATOGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATG A CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAG A 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGAdQAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 
AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT G AATCTTGCA AA 



SEQ ID NO:140 PFH2 Protein sequence: 
Protein Accession #: NP_057113.1

1 11 21 31 41 51 
I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTGASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKA VLQEFG RIDILVNNGG MSQRSLCMDT SLDV YRKliE LNYLGTVSLT KCVLPHMIER 180 
KQGKIVTVNS ILGUSVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVW1SEQP FLLVTYLWQY 300 
MPTWAWWITN KMGKKRIENF KSG VDADSSY FKIFKTKHD 



SEQ ID NO:141 PFH1 DNA SEQUENCE

Nucleic Acid Accession #: NM_021614

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

AIQAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60 
CGCCGGAACC TGC ACG AG AT GG ACTCAG AG GCGCAGCCCC TGCAGCCCCC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG COGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GGTGTCTAAG OCCGAGCACA ACAACTCCAA CAACCTGGCG 240 
CTCTATGGAA CCGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC CAAGTCCAGC A AAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGCCACC GGCGCGCCCT GTTCGAAAAG CGCAAGCGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGGTC ATCG AG ACCG AGCTGTCGTG GGGCGCCTAC 480 
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGCCTTA TCAGTCTCTC CACGATCATC 540 
CIX5CTCGGTC TGATCATCGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 600 
GGAGCAGATG ACTGG AGAAT AGCCATGACT TATGAGCGTA TTTTCTTCAT CTGCTTGGAA 660 
ATACTGGTGT GTGCTATTCA TCOCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCCC ATCCACAACC ACCGCTG ATO TGGA TATTAT TTTATCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAACTTTTC 840 
ACTGATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACGTTTT 900 
GTTATG AAG A CTTTAATGAC TATATGCCCA GGAACTGTAC TCTTGGTTTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGGAT 1020 
GTTACTAGCA ACTTCCTTGG AGCG ATGTGG TTGATATCAA TAACTTTTCT CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GGAAAAGGAG TCTGCTTACT TACTGGAATT 1140 
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTGACTAA AAGAGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAGTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATG ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATGATTT CTGACTTAAA CGAAAGGAGT 1500 
GAAGACTTCG AGAAGAGGAT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATCCACGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAG A TGG AGAGCTA CGACAAGCAC GTCACTTACA ATGCTG AGCG GTCCCGGTCC 1680 
TCGTCCAGGA GGCGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAGCTA^ 



SEQ ID NO:142 PFH1 Protein sequence:
Protein Accession #: NP_067627

1 11 21 31 41 51 
I I I I I I 

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPASV GGGGGASSPS AAAAAAAAVS 60 
SSAPEIWSK PEHNNSNNLA LYGTGGGGST GGGGGGGGSG HGSSSGTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSD YAL1 FGMFGIWMV EETELSWGAY DKASLYSLAL KCL1SLSTU 180 
LLGLTJVYHA RHQLFM VDN GADDWRIAMT YERJFHCLE TXVCAIHPIP GNYTFTWTAR 240 
LAFSYAPSTT TAD VDIDLSI PMFLRLYUA RVMLLHSKLF TDASSRSIGA LNKINFNTRF 300 
VMKTLMTICP GTV1XVFSIS LWIIAAWTVR ACERYHDQQD VTSNFLGAMW USITFLSIG 360 
YGDMVPNTYC GKGVCLLTGI MG AGCTALVV AVVARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WUYKNTKLV KK1DHAKVRK HQRKFLQAIH QLRS VKMEQR KLNDQANTLV 480 
DLAKTQNIMY DMISDLNERS EDFEKRIVTL ETKLETUGS IHALPGLISQ TIRQQQRDFI 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE 
Nucleic Acid Accession #: AL110139, coding region is FGENESH predicted
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I

AJ&CGCGCCG TGCCGCTGCC CGCCCCGCTC CTGCCGCTGC TGCTGCTCGC GCTCCTGGCC 60 
GCTCCCGCCG CCCGCGCCAG CAG AGCCGAG TCCGTCTCCG CGCCGTGGCC CGAACCCGAG 120 
CGCGAGTCGC GGCCACCGCC CGGCCCGGGG CCCGGGAACA CCACCCGGTT TGGGTCTGGG 180 
GCGGCGGGCG GCAGCGGCAG CTCCAGCTCC AACAGCAGTG GCG ACGCCTT GGTGACCCGC 240 
ATTTCCATCC TCCTCCGCGA CCTACCCACC CTCAAGGCAG CCGTGATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360 
AAGAAGACAC GCAAGTATGA TATCATCACC ACTCCAGCAG AGCGAGTGGA AATGGCGCCA 420 
CTAAATGAAG AGGATG ATGA AGATGAGGAC TCCACAGTAT TCGACATCAA ATACAGAGTG 480 
TCCTTGCCGG CTGCACTG AG ACGTCAGCTG CCAGGGTGCC AG ACGCTACT GACAGTTCCT 540 
GTGCCCCCAC CCTTCATCCT CGACATTG AC CTTCCAGCAA GATGCAGTGG AAGGCCTGAT 600 
GGTGGAATCA G ACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTGT GGAAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGG ACTGG ACCTGG AAGC CCTCTTGCGT CGG AGGTGTT 720 
GAAACCAAAA CG AACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 
TCAGACTGTC ACTGGCAAGC TCGTTTCCAC GTCACCACAA TGGAGTTGCT TCTGCCAGCC 840 
TTTGGGCATC CCTTTAAAGT GCCCCCTACT TCTACTCCCC ATGGTTTTCG ACAACTGCAG 900 
CTGAATCTCA TGG AAAAGCT GGATTCCTCT GCCTTACGCA GAAACACCOG GGCTCCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020 
CCTTGGTGGC ACTTCAGCGC CACAGGCTCT CCAATAAAAA CCCTTTACAC ACAAACCATG 1080 
AGTACCTTGG GCTTGGATGT TTTCTGTGGT GCCGGCCAGC GGGGCACCTT TTGTG AAG AC 1140 
AG AGCAGTG A CTAAGGTTCT CCAGGGTAGC TPCTTTCTCC A AACAGCTGCG CTGG AAGCCA 1200 
GCCCTAGAGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TCOGCTGAGC 1260 
ACCCATCCTG TCAGGTTGGC TCGTTCAGAT GCCCGGGGAC AAGCCAGCCT GACGGGGAGG 1320 
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TACCGCAACT 1380 
TGCCTTTTGG TTTTGAAG AT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTTCTACAAA 1440 
ATCTGTCTOC CCTGCTGTGC CGTGGA ACAC CTACGGGAAG CCAAG AGAAG CTCAGTGACT 1500 
GTCCTTGCGT CATTTGAGCA GAGCCCACAA AAGGCAGCTG CTGCCCACGG GGAGCCTGTC 1560 
AAACGAGGGC CCAGTGGGCA ATTGACCAGA CACACATGCC CTGGCTGGGG GATCACACAT 1620 
GCGAACCTGC AGACAATTCC AGATACCCAA GGCCAGGAAG GCCCACGTGA GG ATGTCACT 1680 
CACCCTGGAG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740 
GATGGCAGAT GCCAGAAGAT GGTOCTGATG TCTGAGGAAG GGCCACCTAG TTTGACAGGA 1800 
TGTGAG AGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTCCTTCCTT 1860 
TCCCCCCGAC AGCCCCTGTT TCTGTCCAGG CCCTGA 



SEQ ID NO:144 PFG9 Protein sequence: 

Protein Accession #: none available, FGENESH predicted 

1 11 21 31 41 51 
I I I I I I 

MRAVPLPAPL LFLLLLALLA APAARASRAE S VSAPWPEPE RESRPPPGPG PGNTTRFGSG 60 
AAGGSGSSSS NSSGDALVTR ISILLRDJLPT LKAAVTVAFA FTTLLIACLL LRVFRSGKRL 120 
KKTRKYDOT TPAERVEMAP LNEEDDEDED STVFDIKYRV SLPAALRRQL PGCQTULTVP 180 
VPPPFILDID LPARCSGRPD GGIRPGKTCF PAWWHPVESW SAATWGVKDW TWKPSCVGGV 240 
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300 
LNLMEKLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDLPN PWWHFSATGS PIKTLYTQTM 360 
STLGLDVPCG AGQRGTPCED RAVTKVIjQGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420 
THPVRLARSD ARGQASLTGR RVFRRPRQSL HGGGS AGTAT CLLVLKILLR RHPHLDLFYK 480 
ICLPCCAVEH LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGITH 540 
ANUQTTPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600 
CERLTGSHHF SSHSKS WSFL SPRQPLFLSR P 



SEQ ID NO:145 PFG6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013427

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GGCTGGGCTG CGAATAGCGT GTTCCTCTCC GGCGGAACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT CCACGGAGAG CGCTGAGCGC CGCCGGG AAT TCCATCCCAC 120 
CGTGGGCACG CAGTCTTTGG AGGTCCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
CAAGACAGAG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAGAGTGGGC GCAGCAGCCC AGCGGAGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 
AGAGAGTGCA GGG AGGCGCA GCTCAGGCGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCG 360 
GCGCCGGGAG CGCGGTGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TCGGG ACTGT CCTCGGGCGG CAAGGAGGAG CTTGCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCGAGCAG GAGCGCTGCG TCTCCCGCCT CAGCTAGG AA GGGGGAGTGG CGCTCGCAGG 540 
CTGG AGCTGG GAACCCAGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTTCGC 600 
GTCTTGGGCT CCGGAGG AAG GTTCTAGCGG CTGCAGGAGG TCCCCAG ACC CATTTTCCTA 660 
GAAGGCTGGT G ATGGATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGGAG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTG 780 
GCACCTTTGC CTG AGTCCCT TTOGGTTCCC GACCCAAAGC CACCAGCGTC CAGGGAGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC CGAGAJQTCC GCGCAG AGCC TGCTCCACAG 900 
CGTCTTCTCC TGTTCCTCGC CCGCTTCAAG TAGCGCGGCC TCGGCCAAGG GCTTCTCCAA 960 
GAGGAAGCTG CGCCAG ACCC GCAGCCTGG A CCCGGCCCTG ATCGGCGGCT GCGGGAGCGA 1020 
CG AGGCGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAGAGTCTCG GCCCTCGCTT GGCGTCCTCT TCCCGGGGTC CGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGGAGAAG TCACCATCCG GCAGCTTTCA CTTTGACTAT GAGGTTOCCC TGGGTCGCXJG 1260 
CGGCCTCAAG AAG AGCATGG CCTGGGACCT GCCTTCTGTC CTGGCCGGGC CAGCCAGTAG 1320 
CCGAAGCGCT TCCAGCATCC TCTGTTCATC CGGGGGAGGC CCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGO TGGCTCCAGC AGAGG AAGTT CCAGTCCCCA CCCGACAGTC GCGGGCACCC 1440 
CTACGTCGTG TGGAAATCCG AGGGTGATTT CACCTGGAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTGAGG TCAGTCCCCA TCCAGAGTCT CTCAGAGCTG GAGAGGGCCC GGCTGCAGGA 1560 
AGTGCCTTTT TATCAGTTGC AACAGGACTG TGACCTGAGC TGTCAGATCA CCATTCCCAA 1620 
AGATGGACAA AAGAGAAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAGGAGAA 1680 
AAACAAAGAC AAAGAATTCA TCCCACAGGC ATTTGGAATG CCCTTATCCC AAGTCATTGC 1740 
GAATGACAGG GCCTATAAAC TCAAGCAGGA CTTGCAGAGG GACGAGCAGA AAGATGCATC 1800 
TGACTTTGTG GCTTCCCTCC TCCCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAACCTCAGA AACACCX5AAT GAGTCAACGT CCCCAAACAC 1920 
CCCGGAACCG GCTCCTCGGG CTAGGAGGAG GGGTGCCATG TCAGTGGATT CTATCACCG A 1980 
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTTTCCTTGC CTGCTGAGGC 2040 
TCAAAGTAAA AAGGAAAAAG CCAGAG ATAA GAAACTCAGT CTGAATCCTA TTTACAGACA 2100 
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCACCTAGAA AAACATGGCC TCCAG ACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAGTGAGA CAATTAGGTG AGGAATTTGA 2220 
CCGTGGG ATT G ATGTCTCTC TGGAGG AGG A GCACAGTGTT CATG ATGTGG CAGCCTFGCT 2280 
GAAAGAGTTC CTGAGGGACA TGCCAGACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340 
CATCAACACT CTCTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400 
CCTTCTACCT CCCTGCAACT GCGACACCCT CCACCGCCTG CTACAGTTCC TCTCCATCGT 2460 
GGCCAGGCATGCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520 
GACATCTCTA AACTTAGCCA OCATATTTGG ACCCAACCTG CTGCACAAGC AG AAGTCATC 2580 
AG ACAAAGAA TTCTCAGTTC AGAGTTCAGC CCGGGCTGAG GAGAGCACGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTCCA 2700 
GAACGAAGTG CTGATCAGCC TGTTAGAGAC CGATCCTGAT GTCGTGG ACT ATTTACTCAG 2760 
AAGAAAGGCT TCCCAATCAT CAAGCCCTGA CATGCTGCAG TCGGAAGTTT CCTTTTCCGT 2820 
GGGAGGGAGG CATTCATCTA CAGACTCCAA CAAGGCCTCC AGCGGAGACA TCTCCCCTTA 2880 
TGACAACAAC TCOCCAGTGC TGTCTGAGCG CTCCCTGCTG GCTATGCAAG AGGACGCGGC 2940 
CCCGGGGGGC TCGGAGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000 
GTCGTCGTCA AAGTCAAGGG AAAGTTCTOC TGGACCAAGG CTTGGG AAAG ATCTGTCAG A 3060 
GGAGCCTTTC GATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAGACCC 3120 
AGGAATGACA GGTTCCTCTG GAGACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 3180 
CTCCCTTTCT CAAGGG AACC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAG A 3240 
GCTGGACAGC GACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCCCG CGACGGAGGG 3300 
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGG AA 3360 
AGGCGAGCGG CCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACGACCT 3420 
CAGCGAGAGT GAGCTGGATG TGGCCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480 
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCTCCATACCCGG GCCCAGGGAA 3540 
GCCCGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGGAGA CAOCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCG AGAGCA GCAGGTCACG CAGAAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGGACAG TCCGCGCCTG GGGG ACGCTG GCTGGCICG A 3720 
CTGGCAGAG A GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780 
GCCCGAGACG CTGGTCTGAG CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAQTGT TCTTCTTTCA 3900 
CACTTCTCAA AAGTG ACACA AGAGAAATCC AGTTCACCTA CAG AGGTAG A GCACTCAOGC 3960 
OOCCGCCATT G AG AATAAGG TTCCATTGCG TAGCCAGCCT TAGGAAAAAC AAACAGAACC 4020 
CAAACCAGAT GGCAATGTCC AATCTAAAAA CGTCCCTCTT GGCTCTATAA TATAAGATAC 4080 
AACTCTTGCT TGGTATAGCC TAACCGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 
TCTGTAACAG ATTATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATACTAAA CAATGAGATT 4260 
CTATAGAATG TTCTAGAATG TGCACAAGCG GG1T1C1GTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CCCTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGG AGTC AGATACAAAA AG AAAAATCA CTGAA TGCTT TTAGATATTG 4440 
AATACGTITT CAGGAAAATG CTAAATCTGA TAGATTACGA AATATATTTT TAGAACTTGT 4500 
TTAGAAAGGA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTACAAACT GGAATCCAAC TATAAAGTGT 4680 
TTAAGAATCT ACACAG AATA TTCAAATTAT AGAACATGTT TTTTCCCTTT GCCCCATAAT 4740 
CAGTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGGCCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGGAAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAGT AGTTTGAAGT TTTCAAGTGC ATAACTATTT 4920 
TTGACCAGCA GAAGGCGATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA C TTCG AAGGG 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTGATGAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100 
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG6 Protein sequence:
Protein Accession #: NP_038286.1

1 11 21 31 41 51 

MSAQSLLHSV FSCSSPASSS AASAKGFSKR KLRQTRSLDP ALIGGCGSDE AGAEGSARGA 60 
TAGRLYSPSL PAESLGPRLA SSSRGPPPRA TRLPPPGPLC SSFSTPSTPQ EKSPSGSFHF 120 
DYEVPLGRGG LKKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKPQ 180 
SPPDSRGHPY WWKSEGDFT WNSMSGRS VR LRS VPIQSLS ELERARLQEV PFYQLQQDCD 240 
LSCQITIPKD GQKRKKSLRK KLDSLGKEKN KDKEFIPQAF GMPLSQVIAN DRAYKLKQDL 300 
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PN3BSTSPNTP EPAPRARRRG 360 
AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCOQH 420 
LJEKHGLQTVG IFRVGSSKKR VRQLREEEDR GIDVSLEEEH SVHDVAALLK EFLRDMPDPL 480 
LTRELYTAH NTUXEPEEQ LGTLQLLIYL LPPCNCDTLH RLLQFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNLATIFGP NLLHKQKSSD KEFSVQSS AR AEESTADAV VQKMIENYEA 600 
LFMVPPDLQN EVUSLLETD PDVVDYLLRR KASQSSSPDM LQSEVSFSVG GRHSSTDSNK 660 
ASSGDISPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRLGKDLSEE PFDIWGTWHS TLKSGSKDPG MTGSSGDEFE SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSR ACS TPHVQVAGKA ERPTARSEQY 840 
LTLSGAHDLS ESELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAWIQGP 900 
PEG VETPTDQ GGQAAEREQQ VTQKKLSSAN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960 
LSTDNPDALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002202 

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTCCTAGAT CCGCGAGGGC GCGGCGCAGC CGAGCAGCGG CTCTTTCAGC 120 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 
GGCTGTTCAC CAACTGTACA ACCACCATTT CACTGTGGAC ATTACTCCCT CITACAGATA 240
TGGGAGACAT GGGAGATCCA CCAAAAAAAA AACGTCTGAT TTCCCTATGT GTTGGTTGCG 300
GCAATCAG AT TCACGATCAG TATATTCTGA GGGTTTCTCC GG ATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGGATC AAATGCGCCA 480 
AGTCCAGCAT OGGCTTCAGC AAGAACGACT TCGTGATGCG TGCCCGCTCC AAGGTGTATC 540 
ACATCGAGTG TTTCCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG GACGAATTTG 600 
CGCTTCGGGA GGACGGTCTC TTCTGCCGAG CAGACCACGA TGTGGTGGAG AGGGCCAGTC 660 
TAGGCGCTGG CGACCCGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 
CCACCCGCGT GCGGACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840 
CCGCAAACCC GCGGCCAGAT GCGCTCATGA AGGAGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT G ATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCGAAGCA 960 
TCATG ATG AA GCAACTCCAG CAGCAGCAGC CCAATGACAA AACTAATATC CAGGGGATGA 1020 
CAGG AACTCC CATGGTGGCT GCCAGTCCAG AGAGACACG A CGGTGGCTTA CAGGCTAACC 1080 
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT GAGCGACTTC GCCTTGCAGA 1140 
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGAOCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGCATGAG GAACATTCAT TCTGTATTTT TTTTCCCTGT 1320
TGGAG AAAGT GGGAAATTAT AATGTCG AAC TCTG AAACAA AAGTATTTAA CG ACCCAGTC 1380 
AATG AAAACT G AATCAAGAA ATG AATGCTC CATG AAATGC ACG AAGTCTG TTTTAATG AC 1440 
AAGGTG ATAT GGTAGCAACA CTGTGAAG AC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500 
AAACAAAACG CAAAACCCAG TATATGCTAT TCAATGATCT TAG AAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAGAG GATTTATATT CAAGGATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TGCTGTTTCT ATATTGGTCA TTGCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740 
AGAGACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GTTGACTTTA ACAAGGGGTT AATTGAAATC CTGGGTCTCT TGGCCTGTCC 1860 
TGTAGCTGGT TTATTTTTT A CTTTGCCCCC TCCCCACTTT TTTTG AGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGGTTTT TGAATTCAGA TTTAAAAACC AACTTATAAA 1980 
GCATTGCAAC AAGGTTACCT CTATTTTGCC ACAAGCGTCT CGGGATTGTG TTTGACTTGT 2040 
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGnTT 2100 
CTCTCTCTAT GGAAATAAAA AGGAAAAAAA AAAGGAAACT TTTTTTGTTT GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGGAAG ACTTGCCACT TTTCATGTCA 2220 
TTTGACATTT TTTGTTTGCT G AAGTGAAAA AAAAAG ATAA AGGTTGTACG GTGGTCTTTG 2280 
AATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340 
GCCATATGTA GAATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAGAT 



SEQ ID NO:148 PFG4 Protein sequence:
Protein Accession #: NP_002193.1

1 11 21 31 41 51 
I I I I I I 

MGDPPKKKRL 1SLCVGCGNQ IHDQYILR VS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60 
KTYCKRDYIR LYGIKCAKCS IGFSKNDFVM RARSKVYHIE CFRCVACSRQ UPGDEFALR 120 
EDGLFCRADH D VVERASLGA GDPLSPLHPA RPLQMAAEPI S ARQPALRPH VHKQPEKTTR 180 
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VIRVWFQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNtQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDl 300 
DQPAPQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMV ASPIEA 
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SEQ ID NO:149 PFG2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001172

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GCGG AGCICT GGCTTGG AG A TTCTCAGTCC TGCGGATCAT GTCCCTAAGG GGCAGCCTCT 60 
CGCGTCTCCT CCAG ACGCGA GTGCATTDCA TCCTGAAGAA ATCCGTCCAC TCCGTGGCTG 120 
TGATAGGAGC CCCGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180 
CCATAAGAGA AGCTGGCTTG ATG AAAAGGC TCTCCAGTTT GGGCTGCCAC CTAAAAGACT 240 
TTGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATG ATCT CTACAACAAC CTGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AGAGCTGTGT 360 
CAGATGGCTA CAGCTGTGTC ACACTGGG AG GAG ACCACAG CCTGGCAATC GGTACCATTA 420 
GTGGCCATGC CCGACACTGC CCAG ACCTTT GTGTrGTCTG GGTTG ATGCC CATGCTG ACA 480 
TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATGG ACAGCCAGTT TCATTTCTCC 540 
TCAG AGAACT ACAGG ATAAG GTACCACAAC TCCCAGGATT TTCCTGGATC AAACCTTGTA 600 
TCTCTTCTGC AAGTATTGTG TATATTGGTC TGAG AGACGT GGACCCTCCT GAACATTTTA 660 
TTTTAAAGAA CTATGATATC CAGTATTTTT CCATGAGAG A TATTGATCG A CTTGGTATCC 720 
AGAAGGTCAT GG AACGAACA TTTGATCTGC TGATTGGCAA GAGACAAAGA CCAATCCATT 780 
TGAGTTTTGA TATTG ATGCA TTTG ACCCTA CACTGGCTCC AGGCACAGG A ACTOCTGTTG 840 
TOGGGGG ACT AACCTATCG A G AAGGCATGT ATATTGCTGA GG AAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGGATCTT GTTGAAGTCA ATCCTCAGTT GGCCACCTCA G AGGAAGAGG 960 
CGAAGACTAC AGCTAACCTG GCAGTAGATG TGATTGCTTC AAGCTTTGGT CAGACAAGAG 1020 
AAGGAGGGCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT GAATCAGAAA 1080 
ATCAAGCACG TGTGAGAATT TAGGAGACAC TGTGCACTGA CATGTTTCAC AACAGGCATT 1140
OCAGAATTAT GAGGCATTGA GGGG ATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCTTAATGA GAACATTTAC ACATTCTCAC AATTGTAAAG TTTCCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTTTTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320 
TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 
CTTGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTTTTCAT CTTTCCTCCC TCCTCCCACA 1440 
GOCTGGCTAT ACAGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA CCCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCG A AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCG AGCT 1620 
CCAGTAAG AT G ATAATGG AA AGCAGCAGCT TGTTGGTTGT CACTCTACAA AGAG AAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCTTCTAA ACATTTGGGG GTTAG ACCTG 1740 
GGACCACGGC TGG ATACTCT GAGGCTGTAT GTTTG ATCAC ACAGCCACTT AGCAGGAAGT 1800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTACCTCAC AGAAATGTTA 1860 
AACTGAGACA ATAAAACCCA AAGCAT 



SEQ ID NO:150 PFG2 Protein sequence: 
Protein Accession #: NP_001163.1

1 11 21 31 41 51 
I I I I I I 

MSLRGSLSRL LQTR VHSILK KSVHSVAVIG APFSQGQKRK GVEHGPAAJR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNUVNPR SVGLANQELA EWSRAVSDG YSCVTLGGDH 120 
SLA1GTISGH ARHCPDLCV V WVDAHADINT PLTTSSGNLH GQPVSFLLRE LQDKVPQLPG 180 
FSWIKPClSS ASIVYIGLRD VDPPEHF1LK NYDIQYFSMR DLDRLGIQKV MERTFDLUG 240 
KRQRPIHLSF DIDAFDPTLA PATGTPWGG LTYREGMYIA EEIHNTGLLS AU0LVEVNPQ 300 
LATSEEEAKT TANLAVDVIA SSFGQTREGG HIVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017906

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51
I I I I I I

AATTATATAT TTTTACTCTA TGTTTCTCTA CATGTmTT TCTTTCCGTT GCTGGCGGAA 60 
GAGGCACGTG CGCTGCTGAA TGGAGCTGGT CGCTGGTTGC TACGAGCAGG TCCTCTTTGG 120
GTTCGCTGTA CACCCGGAGC OCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 180 
TGACTTCACT CACCATGCTC ACACTGCCTC CTTGTCAGCA GTAGCTGTAA ATAGTCGTTT 240 
TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATGAAAA AGAAGATTGA 300 
GCATGGGGCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGGAGCGG AAGATGG ACT CATCTGTATC TGGGATGCAA AGAAATGGGA 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGACC TTCCTTTCTA TTCACCCATC 480 
TGGCAAGTTG GCCCTGTCGG TTGGTACAGA TAAAACTTTA AGAACGTGGA ATCTTGTAGA 540 
AGGAAGATCA GCATTCATAA AAAATATAAA ACA AAATGCT CACATAGTAG AATGGTCCCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGCATCCATT AGTGGCACCA TCACAAATGA AAAGAGAATT TCCTCTGTTA AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AG AAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTGCGAAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 
TG AAATTCCA GAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGG ATAAG A AAGTTCCCCC ATCTTTACTC TGTGAAATAA ACACTAATGC 960 
CAGGCTG ACG TGTC1TGGAG TGTGGCTAG A CAAAGTGGCA GACATGAAAA GCCTTCCTCC 1020 
AGCTGCAGAG CCTTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TGACACAGTG CACAAAG AAG AAAAGCGGTC AAAACCTAAC ACAAAGAAAC GCGGTTTAAC 1140 
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AGGTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200 
GGTAGAAATG TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260 
AG ATGTCTCC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
1111 11 1 ICC CTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAAA 1380 
AAACCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGG 1440 
CAGACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTnTACTT TGTAC A AAGC AAATAAAG AT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence:
Protein Accession #: NP_060376.1

1 11 21 31 41 51 
I I I I I I 

MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFVVTGSK 60
DETIHIYDMK KKJEHGALVH HSGTTTCLKF YGNRHLISGA EDGUCIWDA KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAFI KN1KQNAHIV EWSPRGEQYV 180 
VHQNKIDIY QLDTASISGT ITNEKRISSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCE 240 
FKAHENRVKD MFSFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCO NTNARLTCLG 300 
VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGLISTK KRKMVEMLEK KRKKKKJKTM Q 



SEQ ID NO:153 PFD6 DNA SEQUENCE

Nucleic Acid Accession #: NM_014668

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GATGTCTTGG ACATGCTCTG GCTGGCTAAT CTCCATGTTC TAGCCGACTG AAAATACGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAGAAAT TTCCTTTTGA TGTGGCAGAA 120
AATCGAGGAT GTGGAGTGGA GACCCCAG AC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 
CCTGATCTTC AGTGGG ATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TG AGGTACTG 240 
TGACCTGCGA TTGATAAACT CCTCCTGCTT GGTGAGAACA GCCTTGGAGC AGGAGCTGGG 300 
CCTGGCTGCC TACTTTGTGA GCAACGAGGT TCCCTTGGAG AAGGGGGCTA GGAACG AGGC 360 
CTTGGAGAGT GATGCTGAGA AGCTGAGCAG CACAG ACAAC GAGGATGAGG AGCTGGGGAC 420 
AGAAGGCTCT ACCTCGGAGA AG AGAAGCCC CATG AAAAGG GAGAGGTCCC GCTCCCACGA 480 
CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCGAGTCCTC 540 
GGCTCAGCCC ACAGCACTCC CCCAGGG AGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTG AGAA ACAGAGGCCC CGGGCAAGTC AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGGAC 720 
CGGCCAGAGG AGCGTCCAGG TGTCGGTCAC CTCGTCGTGC TCCCAGCTGT CCTCCTCCTC 780 
GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840 
GTGCTCCTTG ACCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCITGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GCCGCCTOOC TCCTGCCCTC 960 
CCCCTOGGTC ATGTGGGCCA GCTCTTTCOG CCCCCTGCTC AGCAAGAOCA TGACATCCAC 1020 
CGAGCAGTCC CTCTACTACC GGCAGTGGAC GGTGCCCCGG CCCAGOCACA TGG ACTACGG 1080 
CAACCGGGCC GAGGGCCGCG TGG ACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140 
COCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATGCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGGAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCTGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAG ATGCCA GCCTGATTTG 1380 
TTCGCACTAT CAGGGTATAA AGAGTGAAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440 
TTATGTGCGG CGTCAGACGG CAOGGATGAG ACTGTCCAAG TACGCAGCGT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCCCGCTACC AGCTGTATG A 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCITA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 
CTTCATCATC CCCAAGTCCA AGGAGCACCA CTTTGTCTTC AGCCAACCTG GAGGCCAGCT 1680 
GGAGAGCATG CGACTACCCC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740 
ATTCACTCCA ACCACCGGCC GTCACG AACA TGGGCTCTTT AATCTGTACC ACGCAATGGA 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGCCC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGGAGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 
GCAGGAGGAG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040 
CTCCTGCGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100 
CTCCTGGTCG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCCC TGCTGGGCCT GCGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCCCTTCT CCCGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGGA 2280 
CCTGACCCAG AACGTGCAGT ACAACCAGAA CCGGTTCCTG TGTGACGATG TAGACTTCAA 2340 
CCTGCGGGTG CACACCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTGATG AA 2400 
GAAGCAG ATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTGATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGOGCCC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580 
CCCGCTGTCC CTGAAGAACC ATGACCACCC AGTGCTGTCT GTCGACTGTT ACCTG AACCT 2640 
GGGATCTCAG ATTTCTGTTT GCTATGTG AG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTCGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTTTTG TGGGAGCTAG 2760 
CT1 1 1 ' lO AAA AAGTTTCATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG CGCCAGACGG TCGTCCGCCT GGAGCTCGAG G ACG AGTGGC AGTTCCGGCT 2880 
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GCGCGATGAG TTCCAGACCG CCAATGCCAG GGAAGACCGG CCGCTCTTTT TTCTG ACGGG 2940 
ACGACACATC TGAGGAAGAC AGCX3GCGAGT TTTCTGAAGA GATGAGTGCT CAGAGCCCTC 3000 
ATGCTGTTGA GGCTAAAGGG AGGCCTGGAA CGGTGGGGCG TTTGACTGGA ATGGACCCCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCCGAGGCCG TGGTCCTGGG 3120 
AGCCAGGAAG ACTCCGCAGT GGGTG AGAAT GAAAACTTGA GACTCCCAAG TTCTGGGCCA 3180 
GCCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCAGGAGG AACAAAGATT TACTTCCTGT 3240 
CCTGCCATTC GTGTGCTTCC ATGGACAAAC CTGATTTTTT TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAACC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTAA AGGCTGG AGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATAAAGTTA AACAAATTGA TTTACTTCAG 3540 
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA GAAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCT T TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGCGGGGA 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720 
TGCTTACTTG AAACAGACAA TGAAAACAAC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840 
CTCTTACTAA TGTTTGGTTC CTCAGGGAAG AATCTCACTT G ACTAG AG AG GAGGTGGGAA 3900 
CAGAAGAGAG AAGG AGGCA G GGAGATGTA T TTCTT AGGGC TCACCCCTTC ACAGACTGAC 3960 
AGAATGGTTT TGTTTTGTTT TGl i 1 iGTTT TGTTTTGTTT TTGAGATGGA CTCTAGCTCT 4020 
GTCACCCAGG CTGG AGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG (XCACCACCA 4140 
CGCCCGGCTA ATiTn 1G TA i l l 1 1 1A GTA GAGACGGGGT TTCACCATGT TAGCCAGGAT 4200 
GGTCTCGATC TCCTGACCTC GTG ATCCGCC CGCCTCGGCC TCCCAAAGTG CTGGGATTAC 4260 
AGGOGTGAGC CACCGTGCCT GCOCCAG AAT GGTTTTTAAA GCCACAGTTG AGAGGCCACC 4320 
CATTGCCCGG CGCCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380 
ATTGGAATTA TTCATCCCCT TTGAAAGATG AG AAGGTTGA GATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTGGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGG ACTC 4500 
TGCAGTCCAG GTCTCCCTTC TCCCACTTGC CTACCCTCAA TGCCACACTG TTTTTGAAGT 4560 
GGCCCATAAC TTGAAGGAAA AGTTTAAAGA CAGTTCAATT TAATCATCAG AATGCATTCT 4620 
TITTTTTTTC GG AGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTCCAG CCTCAGCCTC 4740 
COGAGTAGCT GGGATTATGG GCGCCCACCA CCATGCCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTGATCTG CCCACCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCT AGACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA GAATAAATGA TTGTAAAAGA TGATGCCGAA GAGTTG ATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTOCG CG AGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTTGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence:
Protein Accession #: NP_055483.1

1 11 21 31 41 51 
I I I I I I 

MWQKJEDVEW RFQTYLELEG LPCHJFSGM DPHGESLPRS LRYCDLRUN SSCLVRTALE 60 
QELGLAAYFV SNEVPLEKGA RNEALESDAE KLSSTDNEDE ETjGTEGSTSE KRSPMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTAL PQGEHARSPQ PRGPAEEGRA PGEKQRPRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMVVST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDG FHPRRLL LSGPPQIGKT GAYLQFLSVL 360 
SRMLVRLTEV DVYDEEEINI NLREESDWHY LQLSDPWPDL ELFKKLPFDY HHDPKYEDA 420 
SLICSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AFSYSMLGEE IQLHFIIPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEYI 540 
KSPTFTPTTG RHEHGLFNLY HAMDGASHLH VLWKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFU KELSYHNLEL ERNRQEELGI KPQDIWPFIV ISDDSCVMWN WDVNSAGER 660 
SREFS WSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFH 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIWG GHRSFHTTSK 780 
VSDNS AAWP AQYICAPDSK HTFLAAPAQL LLEKFIjQHHS HLFFPLSLKN HDHPVLSVDC 840 
YXNLGSQ1SV CYVSSRPHSL NISCSDLLFS GLLLYLO)SF VGASFLKKFH FLKGATLCVI 900 
CQDRSSLRQT WRLELEDEW QFRLRDEFQT ANAREDRPLF FLTGRHI 



SEQ ID NO:155 PFC6 DNA SEQUENCE

Nucleic Acid Accession #: NM_000522

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

A3SACAGCCT CXX5TGCTCCT CCACCCCCGC TGG ATCGAGC CCACCGTCAT GTTTCTCTAC 60 
GACAACGGCG GCGGCCTGGT GGCCGAOGAG CTCAACA AGA ACATGGAAGG GGCGGCGGCG 120 
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TCGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTG CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGGA 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG G AGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360 
GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCCGCGT CGTCCTCGGG AGGTCCCGGC 420 
CCGGCGGGCC CGGCGGCGGC AGAGGCGGCC AAGCAATGCA GCCCCTGCTC GGCAGCGGCG 480 
CAGAGCTCGT CGGGGCCCGC GGCGCTGCCC TATGGCTACT TCGGCAGCGG CTACTACCCG 540 
TGCGCCCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC CCCCTCGGCC 600
GCCGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCG CCGGCCCAGC TGCCGAGGAG 660
TTCAGCTCCC GCGCTAAGGA GTTCGCGTTC TACCACCAGG GCTACGCAGC CGGGCCTTAC 720
CACCACCATC AGCCCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780
CCGGGCGAGT CGCGCCACGA ACCCTTGGGT CTTCCCATGG AAAGCTACCA GCCCTGGGCG 840
CTGCCCAACG GCTGGAACGG CCAAATGTAC TGCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900
CTCTGGAAGT CCACTCTGCC CGACGTGGTC TCCCATCCCT CX5GATGCCAG CTCCTATAGG 960
AGGGGGAGAA AGAAGCGCGT GCCTTATACC AAGGTGCAAT TAAAAGAACT TGAACGGGAA 1020
TAOGCCACGA ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CACGACGAAT 1080
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGGA GGGTTAAAGA GAAAAAAGTC 1140
ATCAACAAAC TGAAAACCAC TAGTTAA



SEQ ID NO:156 PFC6 Protein sequence: 
Protein Accession #: NP_000513.1

1 11 21 31 41 51
I I I I I I

MTASVLLHPR WIEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFFH 60
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPSAAAAA 120
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYFGSGYYP 180
CARMGPPPNA IKSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240
HHHQPMPGYL DMPWPGLGG PGESRHEPLG LPMESYQPWA LPNGWNGQMY CPKBQAQPPH 300 
LWKSTLPDW SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360 
LSERQVTIWF QNRRVKEKKV INKLKTTS 


SEQ ID NO:157 PFA3 DNA SEQUENCE

Nucleic Acid Accession #: AW102723
Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120
TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGOGGAGGAC 180
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACOC CAGCCGGGCG TGATCTCACC 240
ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300
GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360
TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420
TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480
TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540
AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCKXTTAC TGGCACCAGG TCAAGTTCCT 600
AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660
TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720
AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780
GAAOGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840
AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900
CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960
ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020
CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080
TOCATTCTAT GCCIX3G ATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140
AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200
ACGGAAGTGG AAGTGTCGTT AATGOCTCOC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260
AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320
AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380
CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440
ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560
GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620
ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTO TGTGGACAGA 1680
TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740
AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800
GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860
ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920
CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980
TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040
TACACTCGCT TCGACGAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100
ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160
GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220
CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280
AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340
TGCAGTGTAC CACGAAAAAT CAATGTCAGC OCAACAACTT ACAGATTACT CAAAGACTGT 2400
CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460
ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520
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TTCCAAAAGA AAG ATGTGGA AG ATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 
TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTC ATTG AA GATGTGTAGA 2640 
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCT_AA£AAG CAGTATTAAA ATTTCAGGAG 2700 
CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCA AGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AG ATAATTGT 2880 
AGTCAATTGT ACAAACTGAT GG AGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTG A TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 



SEQ ID NO:158 PFA3 Protein sequence:
Protein Accession #: NP_000847.1

1 11 21 31 41 51 
I I I I I I 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180
GQLEDASILC LDKEDDFLHV YYFFPKRTTS LILPGIIKAA AHVLYETEVE VSLMPPCFHN 240
DCSEFVNQPY IXYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTHjQFG 300
NGIRRLMNRR DFQGKFNFEY FHLTPKINQ TFSGIMTMLN MQFWRVRRW DNSVKKSSRV 360
MDLKGQMIYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDVV LIGEQARAQD 420
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQWQ AKKFSNVTML 480
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVETIAMPIV WLGGLHKESD 540
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AGWGVKMPR YCLFGNNVTL 600
ANKFESCSVP RKINVSPTTY RUJCDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660
TNSKPCPQKK DYEDASQFFR QSIRNRLATY XPIYKSLGFD SLKMCRASES TDGIVDG



SEQ ID NO:159 PFA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004362

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CGCCGGCGGG ACTGGTCTGA AGAGACGCGG GGACAAAGTG GCAACGACTT GGACATCTGA 60
GCTGTCACTG COGAAAACAG GCCGCAAGAG AGATAATCAA TATGCATTTC CAAGCCTTTT 120
GGCTATGTTT GGGTCTTCTG TTCATCTCAA TTAATGCAGA ATTTATGGAT GATGATGTTG 180
AGACGGAAGA CTTTGAAGAA AATTCAGAAG AAATTGATGT TAATGAAAGT GAACTTTCCT 240
CAGAGATTAA ATATAAGACA CCTCAACCTA TAGGAGAAGT ATATTTTGCA GAAACTTTTG 300
ATAGTGGAAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATGAC ATGGATGAGG 360
AAATTTCAAT ATACGATGGA AGATGGGAAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420
GTGACAGAGG ACTGGTATTA AAATCTAGAG CAAAGCATCA TGCAATATCT GCTGTATTAG 480
CAAAACCATT CATTTTTGCT GATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540
ATGGTATTGA TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTGATTC 600
TGGAAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACCAGAT AAATGTGGAG 660
AAGATTATAA ACrTCATTTT ATCTTCAGAC ATAAACATCC CAAAACTGGA GTTTTCGAAG 720
AGAAACATGC CAAACCTCCA GATGTAGACC TTAAAAAGTT CTTTACAGAC AGGAAGACTC 780
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTGA GGTGTTAGTT GATCAAACAG 840
TTGTAAACAA AGGAAGCCTC CTAGAGGATG TGGTTCCTCC TATCAAACCT CCCAAAGAAA 900
TTGAAGATCC CAATGATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020
GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080
CTGAAAAACC TGATGACTGG AATGAAGACA CGGATGGAGA ATGGGAGGCA CCTCAGATTC 1140
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGGAAACC TCCCATGATA GATAACCCAA 1200
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260
GTCCTCGAAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320
CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380
TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440
AAATAATGAT AGCAAATGCT AATAAGCCTG GTGTATTAAA ACAGTTAATG GCAGCTGCTG 1500
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGCCA ATAGCATTAA 1560
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620
AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680
AAGCAGCOCT GGAAAAACCA ATGGACCTGG AAGAGGAAAA AAAGCAAAAT GATGGTGAAA 1740
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800
TAGAAGGGCA AGAAGAAAGT AATCAATCAA ATAAGTCTGG GTCAGAGGAT GAGATGAAAG 1860
AAGCAGATGA GAGCACAGGA TCTGGAGATG GGCCGATAAA GTCAGTACGC AAAAGAAGAG 1920
TACGAAAGGA CIAAACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980
AAAAATCAGC ATGCCAGACC TGAACTTTAA TCAGTCTGCA CATCCTGTTT CTAATATCTA 2040
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100
ACTTTGAAGT TACCTCATCT TTGAATTTAG AATAAAAGTG GCACATTACA TATCGGATCT 2160
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGGAGATAG TTTTGGTTTG 2220
TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGGAATT 2280
TCCACTTAAA TGGCTATACA ACAATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340
TTCTGTTGTG AAGAGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTGAT TCTATCAACA 2400
ATTGAAAGTG TTGTATATGA CCCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460
AGTTGTTTGC TTAAATTATA GATTCCTTTA AGGACATGCC TTGTTCATAA AATCACTGGA 2520
TTATATTGCA GCATATTTTA CATTTGAATA CAAGGATAAT GGGTTTTATC AAAACAAAAT 2580 
GATGTACAGA TTTTTTTTCA AGTTTTTATA GTTGCTTTAT GCCAGAGTGG TTTACCCCAT 2640 
TCACAAAATT TCTTATGCAT ACATTGCTAT TG AAAATAAA ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 



SEQ ID NO:160 PFA1 Protein sequence:

Protein Accession #: NP_004353.1

1 11 21 31 41 51
I I I I I I 

MHFQ AFWLCL GLLFISIN AE FMDDDVETED FEENSEEIDV NESELSSEIK YKTPQPIGEV 60 
YFAETFDSGR LAGWVLS KAK KDDMDEE1SI YDGRWEIEEL KENQVPGDRG LVLKSRAKHH 120 
AIS AVLAKPF IFADKPLTVQ YEVNFQDGID CGGAY1KLLA DTDDULENF YDKTSYIIMF 180 
GPDKCGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FIDRKTHLYT LVMNPDDTFE 240 
VLVDQTWNK GSLLEDWPP IKPPKEIEDP NDKKPEEWDE RAKDPDPSAV KPEDWDESEP 300 
AQIEDSS WK PAGWLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRIGCGEWKP 360 
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFIIC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWUYLVTA 480 
GVPIALITSF CWPRKVKKKH KDTEYKKTDI CIPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540 
KQNDGEMLEK EEESEPEEKS EEEIEUEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 



SEQ ID NO:161 PEZ9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005932

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGG ACT 60 
GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTCG AAGC CGGG ATCCGG GCCCGAAGGG 180 
TCAGCACCAG CTGGTCTCCC GTGGGCGCCG CCTTCAATGT CAAGCCCCAG GGCAGCCGCT 240 
TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TG AGCTGAGT GCCCCAGAAG 300 
GATTTCATAT TGCACAAGAA AAAGCCTTGA GAAAGACAGA ATTGCTTGTG GACCGTGCAT 360 
GTTCCACCCC ACCTGGGCCC CAGACCGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAGAGTGGC CGACTTGGCT GATTTTGTGA AAATCGCTCA CCCTGAGCCA GCATTCAG AG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCAGCATGGT AGAGAAGTTG AACACAAATG 540 
TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600 
ATCCAGAAAC AAGGCX5AGTG GCTG AACTGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660 
ATCTAGACAA ACAAAAGCGT AAAAGAGCAG TGGACCTCAA TGTTAAAATC TTGGATTTGA 720 
GTAGTACATT TCTTATGGGA ACCAATTTTC CCAACAAG AT TG AGAAGCAT CTCTTACCAG 780 
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GGTCTCCACG 840 
CAGAATCACC AGATGACTTG GTGCGAG AAG CTGCTTATAA AATTTTTCTT TATCCCAATG 900 
CTGGTCAATT G AAATGTTTA G AAGAATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CACGTTTTCT CACAGGGCTC TCCAAGGAAC GATAGCTAAA AATCCAGAGA 1020 
CTGTCATGCA GTTCCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTCCGAA GTAATGCCCT 1140 
GGGACCCCCC TTACTACAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGCCCAGCC 1200 
TATATTGCCC GT mT C TCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACTGTTGGG GATTTCATTA TATGCAGAGC AGCCTGCAAA AGGAGAGGTG TGG AGCGAAG 1320 
ATGTCCG AAA ACTGGCTGTT GTTCATGAAT CTG AAGG ATT GTTGGGGTAC ATTTACTGTG 1380 
Ailll ITTCA GCGAGCAGAC AAACCACATC AGGATTGCCA TTTCACTATC CGTGGAGGCA 1440 
GACTAAAGGA AGATGGAGAC TATCAACTCC CACTTGTAGT TCTTATGCTG AATCTTCCCC 1500 
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTOCTGGCAT GATGGAAAAT CTTTTCCATG 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GACGTACTCG TTACCAACAC GTCACTGGGA 1620 
CCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT G ATGGAGTAC TTTGCAAATG 1680 
ATTATCGAGT AGTTAACCAA TTTGCCAG AC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740 
ATATGGTGTC TCGTCTTTGT GAATCTAAAA AGGTTTGTGC TGCAGCTGAT ATGCAACTTC 1800 
AGGTCTTTTA TGCCACTCTG G ATCAAATCT ACCATGG G AA GCATCCCCTG AGGAATTCAA 1860 
CCACAG ACAT TCTCAAGG AA ACACAAGAGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGGCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040 
ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACGGTGGA GGCAGGGAGC 2100 
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATG AC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGG ACTTCG AAACTTTCCT CATGGATTCT GAAIMAAG A 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATGACT TTGTTATAAA TGCTACAGCT 2280 
GTG AG AGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 
TGGTAG AACT TGG AATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA 



SEQ ID NO:162 PEZ9 Protein sequence:
Protein Accession #: NP_005923.1

1 11 21 31 41 51 
I I I I I I 

MLCVGRUGGL GARAAALPPR RAGRGSLEAG IRARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFHIA QEKALRKTEL LVDRACSTPP GPQTVUFDE LSDSLCRVAD 120 
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180
RVAELFMFDF HSGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240
NFTSAGDHII IDGLHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360
YSGVIRAERY NIEPSLYCPF FSLGACMEGL NILLNRLLGI SLYAEQPAKG EVWSEDVRKL 420
AVVHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLVVL MLNLPRSSRS 480
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRVV 540
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSD LDLDFETFLM DSE



SEQ ID NO:163 PEZ8 DNA SEQUENCE

Nucleic Acid Accession #: AF103907

Coding sequence: none (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTGGG AAGGACCTGA TGATACAGAG 120 
GAATTACAAC ACATATACTT AGTGTTTCAA TGAACACCAA GATAAATAAG TGAAG AGCTA 180 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGACGGC ACTTTCTG AG 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300 
GGCTGCTGAC TTTACCATCT GAGGCCACAC ATCTGCTGAA ATGGAGATAA TTAACATCAC 360 
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTGAC ATGTTTTTGC ACATTTCCAG 420 
CCCCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT CCCTGGGAGA 480 
AATGCCCGGC CGCCATCTTG GGTCATCG AT G AGCCTCGCC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAGAAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG G ATATTTATT TG AACGGGAT TACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTACCAAT G AG AGGAAAA CAGACG AGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAACCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG TAATATCTG A TCTCTACGGT TCCTTCTGGG 900 
COCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAG ATCTGTA 960 
CTGTGACCTT TCTACACTGT AGAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCT AAT ATGTAGCTGA CTOTTTTTCC TAAGGAGTGT TCTGGCCCAG GGGATCTGTG 1080 
AACAGGCTGG G AAGCATCTC AAG ATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATG A 1140 
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCCCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAACTT TTTTTTTTAA CCTGGAAGAA TTCAATGTTA CATGCAGCTA TGGG AATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC CCXTTTGTTT 1380 
GATTTTTTTT OCAGTATAAA GTTAAAATGC TTAGCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTCCTTGAAC ATGTCAGGAC ATACATTATT CCITCTGCCT 1560 
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAG AAG GGACACATAT GAGATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAGAGTTTA GATAAATATA TGAAATGCAA GAGCCACAGA 1740 
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAG AATTC ATGCAGTGCA AATCCCCAAA GGTAACXHTr ATOCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTGAAAGAAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAGAAT TTACAAAG AG CTACTCAGG A CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTGTGTGTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CTTG ACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TG AGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTGATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280 
TTCACAAAAG CAGCTGG AAA TGGACAAGCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAGA AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTG AATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCCCTCTTTG 2580 
TGTTCATGGA TAGTCCAATA AATAATGTTA TCTTTGAACT GATGCTCATA GGAGAG AATA 2640 
TAAGAACTCT GAGTGATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAGA TACAAAGAAC TCTGAGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGGAC CCAACGCATG TCTGAGATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGGATGC TAGCTTCTGG CCATCTCTGG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACG ACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAG AATTT CAACG ACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC CTG AATGCCT AGACCCTTAT TTTTATTAAT 3060 
TTCCAATAGA TGCTGCCTAT GGGCTATATT GCTTTAGATG AACATTAGAT ATTTAAAGCT 3120 
CAAGAGGTTC AAAATCCAAC TCATTATCTT CTCl'l ICTTT CACCTCCCTG CTCCTCTC(X 3180 
TATATTACTG ATTGCACTGA ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAGA CTGCTGAAGC C AGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGGACC ACTCCTGGGC CTTCGTGATT GTCAGG AGCA AGACCTGAGA 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATGAAGATCC ATAGAATTTG 3420 
CTACATTTGA GAATTCCAAT TAGG AACTCA CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
ACTTGCTG AA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTG ATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 
AAAGTGGCTT TTATTCTCTT TATTATTATT Al 11 TCI 1 1 1 ACTACTATAT TACGTTGTTA 3660
TTATTTTGTT CTCTATAGTA TCAATTTATT TGATTTAGTT TCAATTTATT TTTATTGCTG 3720 
ACTTTTAAAA TAAGTGATTC GGGGGGTGGG AGAACAGGGG AGGGAGAGCA TTAGGACAAA 3780 
TACCTAATGC ATGTGGGACT TAA AACCTAG ATGATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATGTATC CCAGAACGTA 3900 
AAGTAAAATT TAAAAAAAAG TGA



PEZ8 Protein sequence:
Protein Accession #: none

SEQ ID NO:164 PEZ6 DNA SEQUENCE

Nucleic Acid Accession #: AB028945

Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51
I I I I I I 

ATGATGATGA ACGTCCCCGG CGGAGGAGCG GCCGCGGTGA TGATGAOGGG CTACAATAAT 60 
GGTCGCTGTC CCCGGAATTC TCTCTACAGT GACTGCATTA TTCAGGAGAA GACGGTGGTC 120 
CTGCAG AAAA AAG ACAATG A GGGCTTTGGA TTCGTGCTTC GAGGGGCCAA AGCTG ACACA 180 
CCCATTG AAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGG ACCG GGGACT1C1T GATTGAGGTT 300 
AACAATGAG A ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT CCGGCAGGGA 360 
GGGAATCACC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGACGACACC 420 
GCCAGGAAG A AAGCTCCCCC GCCTCCAAAG CGGGCACCGA CCACAGCCCT CACCCTGCGC 480 
TCCAAGTCCA TGACCTCGGA GCTGGAGGAG CTCGTGGATA AAGATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT GAGAACATGG CTGTGGAACC GAGGGTGGCG 600 
ACCATCAAGC AGCGGCCCAG CAGCCGGTGC TTCCCGGCGG GCTCAG ACAT G AACTCTGTG 660 
TACGAACGCC AAGGAATCGC CGTGATGACG CCCACTGTTC CTGGGAGCCC AAAAGCCCCG 720 
TTTCTGGGCA TCCCTCGAGG TACGATGCGA AGGCAG AAAT CAATAGACAG CAGAATCTTT 780 
CTATCAGGAA TAACAGAGGA AGAGCGGCAG TTTCTGGCTC CTCCAATGCT GAAGTTCACC 840 
AG AAGCCTGT CCATGCCGGA CACCTCTG AG G ACATCCCCC CTCCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTACGGGA CGATTAAGCC TGCGTTCAAT CAGAATTCTG CCGCCAAGGT GTCOCCGGCC 1020 
ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGACCGCT ACTCCTTGG A CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140 
AACTTCCX3CA ACAAG AGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1 260 
CAGTCCAACG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATCCC GACCATCATC 1320 
GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGCC AGGGCAGCAG CATGGAGATC 1380 
GACCCCCAGG CCCCGGAGCC ACCG AG CC AG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCOCTTTG CCGCCGCCAT CGCCGGAGCC GTCCGCGACC GTGAGAAGCG GCTGGAAGCC 1500 
AGGAGGAACT CCCCGGCCTT CCTCTCCACA G ACCTGGGGG ATGAGG ATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTCCATG TTCCCCGAGG AGGGGGATTT TGCTGACGAG 1620 
GACAGCGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCCCAGGGA GCCCGAAAAC 1680 
CATTTCGTGG GTGGCGCCG A GGCCAGTGCT CCGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG CCCAGGGGCC CG AGAGCAGC CCAGCAGTGC CCTCCGCGAG CAGCGGCACA 1800 
GOCGGCGCCG GG AATTATGTCCACCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGCCCTGG CACTCTCCGC AAGGGACCGA GCCATGAAGG AGTCTCAACA GGGACCCAAA 1920 
GGGGAGGCCC CCAAGGCCGA CCTCAACAAA CCTCTTTACA TTGATACCAA AATGCGGCCC 1980 
AGCCTGGATG CCGGCTTCCC TACGGTCACC AGGCAGAACA CCCGGGGACC CCTGAGGCGG 2040 
CAGGAGAGGG AGAACAAGTA CGAGACCGAC CTGGGCCGAG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGACGOCAC TAAGCTGGAC AACGCCCTGC AGGAAGAGGA CGAGAAGGCA 2220 
GAGGTGGAGA TGAAGCCAGA CAGCTCGCCG TCCGAGGTGC CAGAAGGTGT TTCCG AAACC 2280 
GAAGGTGCTT TACAGATCTC CGCTGCCCCC GAGCOCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGGA AG AGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATTTTTACAG AGCCATTGCC TCCTCCCCTG 2460 
GAATTTGCAA ATAGTTTTGA TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 
GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580 
CCAACCAACT CTGCAG ACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTG AAAG CTTTGACGCC GTCGCCG ACT CTGGGATCGA GGAGGTGGAC 2700 
AGCOGGAGTA GCAGCG ACCA CCACCTCG AG ACGACCAGCA CTATCTCCAC CGTGTCTAGC 2760 
ATCTCCACCC TGTCTTCCGA AGGTGG AG AG AATGTGGACA CCTGCACAGT CTATGCAG AT 2820 
GGGCAAGCAT TTATGGTTGA CA AACCCCCA GTACCTCCTA AGCCAAAAAT GAAGCCCATC 2880 
ATTCACAAAA GCAATGCACT TTATCA AG AC GCGCTCGTGG AAGAAG ATGT AGATAGCTTT 2940 
GTTATCCCCC OGCCCGCTOC CCCGCCCCCG CCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGACCTC CAAGTTGTGG GGCGACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 
CG AG AG AAAT TGGCAAAGCC GGGGGAAGGA CTGG ATTCAC CAATGGGAGC CAAGTCCGCC 3180 
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATGAAA GCAGGACCTC AGG AACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAG ACCCT GCCCGCCCCC CTGTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 
GCTCTCTCAG ATG 1U1I I AG CCTTCCAAGC CAGCCCCCTT CTGGGGATCT ATTTGGCTTG 3480 
AACCCAGCGG G ACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540 
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAGATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTG A ACATAAAG AG GCCTTCATGG ACAATG AG AT CG ATGGCAGT 3660 
CACTTACCAA ACCTGCAG AA GG AGGACCTC ATCGATCTTG GGGTAACTCG AGTCGGGCAC 3720 
AGAATGAACA TAGAAAGGCC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTCC 3780
ACCTCGCAG A CTGCTCTTGT TATAAGTAGA GATGGGCTCG TGCTG AAACA TCTGAATGCC 3840 
AAGCGAAGTC TGTG AGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAGAAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAGACCC CTGGCTCACC ATGTGGGTGT 3960 
CTTGGGCAGT TTCTATCACA CATGGG ACAA GGGG AGGGAG TTTTTCTAAC ATGGAAAAAG 4020 
ATTCCCAGCC TGCCGCCCAG CATGCAGGTG GCCTCGCTTT GCCGGGTCCG AGAGGCTCCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
CCTGAGACCT CCGTCCTCTG CTTTCOGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200 
GCCnrCCTC AGTCCTGTGG CCTCTCAGAG GACACCTGAT GCrCACCTGC CCXrrCTTTCT 4260 
CCTGCACTTG GCTTGCAGTG AGATGCTCCC AGATGCATTT GTCCAGTGCC CCATCATGGG 4320 
CCTGAAAGGC AGAGAAACTT TTTCCTACAC AGATTCTTTT CXXCATCTCC TCCTGTGGTT 4380 
TGCATOCATO GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATCG 4440 
TGCCCAGCTT TGCTTAGCTT TCTTTATTTC TGCAAATCTG TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACGATTCCAC AATGGAGGGG 4560 
AGACCTGGCC AAGGGAGCCA GCCAGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATGAG CACCCATCCA GGATGGAGAA TAAGGGCTTC TCTGCCTCTC 4680 
AGAATTCTTT TTAATTGAAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT GAAGG ACAAG AAGACGCATG GCTCATGGCG GGCACATGCG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGGA GCTGGGCGGG CACAGCATCC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTGAGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTCGGA GGCCAGGG AA GATGGTACTT 4920 
AGAGGCTTTT CCCCTATCGC TCTGGGTGTC TAGGAATCCC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGG ACC CAGTGGGTAT GGAGTATAGA CAGAACCCAG GGTTGAGAAC 5040 
AGAAGGTGGG OGGCAGGATC AGAGTGAAAG CAGAGGCGTG AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGCCAGGTCA GCCTCTCTGG CAAGGCTTTC TTGAGGCCCG CCCCTTTCTT 5160 
TCCCCGGAGT OCCTCCACCC CATAACAATA CCTCGAATTT CCAAAAG AGG TCACCAGATG 5220 
CACATGGGCC GCAAA ACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280 
CTCGAATGTC AGGTTTTTGG TTTTATTATT ATTTCAGAAC TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT CKjTTTTGTTT TTTTTTTTTC (nTTTTTTCT TGATTAGGTC TGG AACAGCT 5400 
CTAGAATG AA CACATAAAAT TTAGCAATTT AAAATCTTTC TTTACTGCAA GTTTAAATAG 5460 
TTGTACAGAT AGTTTATAAG CACAATATTT TAAGAAAAAA AAGTGGCTGG TCTACTAGGC 5520 
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TGATTTAATA 5580 
ATACTATTTC TGTGG AATAA TTATAAAAGT ATG ACCTTTT TAAATCAACC TTATTTGGAT 5640 
GCATCIGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT CGTCATTGAA 5700 
GGACAATTTC TTCAAAGTGG TTACAATTCA TAATGCAGCA GTTTCTCCAA AAACAAAAAC 5760 
AAAACACACA CCACACACAC GCGC1 ill CC AGTCACACAC CCCTGATGTT GGAACCAAGT 5820 
TTTTGGACCr TCTGTTCCAA AACC11T1GC AGGTCAATCT TTGTATTTGA AATGATCCAA 5880 
TCCAACTTG A AGTCAATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATGAG AT GAATGAGCAT TACTCTAGAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCTGAGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060 
TTCCATTTGA TGATACCGCA AAATTCCGTG AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120 
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAACC AGTCGAAACT CGTGACTTCG GTTTGTTGAA 6240 
CTTTGGCAGC CAGTTGGTGA GGGCCAGATG TTATTCCCTT TCTTAAAGAT ACTCCAAGCC 6300 
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA GAG AAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCTGCTTATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATCACCACC CACTGTTCTC CCOCTTTGGG ACATGTTAGG 6540 
ACGAGGCCCT ATTCCATGCC OCTCTTTAAT GGTGGAACAA ATGTTAAACT GCTCATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TA GT nCTGT AATATTCTGT ATACAAAACC CAAATCTCTC 6720 
AAAATGTAAA TTATGTATAC CTGCCAAGAT ACCTTTTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TGATACTTTA GTTAATAAAA TGGTAAATTTTTTTCTCAGT 6900 
TATTGAACAA GCAAGCATTA TCCAGTTGAT CTGGCAATGA CTTTTTGTGT GTGGGCCACA 6960 
ATATTGATTT TCCCATTAAC AATTTTTTTT TGTTTTTTAA ATACTAATAT GTTTCACACT 7020 
ATAGTTTGTG TAACAACACG TGTTCGCATT ATCTATGTTG CPGTTACTTT TGTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTG ATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTOCATT TGCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTGACCG CTGCTATAGG CGTGGAGCTG AGGCTCGGCT 7500 
TTTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAG A TCCGG ACCCT 7560 
TCCCTTAGGG TCCAGTCTCC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
G AGTGGCAGA ACTGGGCCGC CTCTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGGAG 7680 
CAAG AGAG AA TTTGTGTCTA TTGGCAAAG A ACTAAGCCAG G AAG ACATGG GOCATCCCTC 7740 
CGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTG AACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAACTTGTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTGAGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGGA AA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
OCCTCAAGCT CTCCCGCTTC ACC ATCCAAT AGTTTCTCCC AAACCTTGGC ACCCCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTCCAGACCA CTTTTCCTAG ATG AATATAT TCGTT TACCT 8040 
TACTAGGAAA ATTATTGOAA GATTTTTTCT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100 
TGGCGAACTG GAATGTGTTT CTGTATTTGT AG ACAACCAT GTACCCATGC AAGTAGGTGA 8160 
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220 
GTGGGGAATC AG AGAATTTC CAAACTTGTT TCTCAGACTT CCGCAGATCT CATCACTTTG 8280 
ATTTCTAATC C ATGCTGTAT TGGTG ATTTT GTTTATCGTT CCTGTAACTT GTTCTACATT 8340 
OCACAGTCTT TACCGTTTTA TGTTCAAAAT TACAACAATC CCTGTCCATT GATTCCACTC 8400 
TGGAACTCTT TGTTCATGCC AATTTTG AAA TTTTAATACG AGCCTTCAAA TAAACACAG A 8460 
AAAGAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:165 PEZ6 Protein sequence:
Protein Accession #: BAA82974.1 



1 11 21 31 41 51
I I I I I I 

MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCIIEEKTVV LQKKDNEGFG FVLRGAKADT 60
PIEEFTPTPA FPALQYLESV DEGGVAWQAG LRTGDFLIEV NNENVVKVGH RQVVNMIRQG 120
GNHLVLKVVT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEEI 180 
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNSV YERQGIAVMT PTVPGSPKAP 240 
FLGIPRGTMR RQKSIDSRIF LSGITEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300 
PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGKI ASKAVYVPAK PARRKGMLVK 420 
QSNVEDSPEK TCSIPIPTII VKEPSTSSSG KSSQGSSMEK DPQAPEPPSQ LRPDESLTVS 480
SPFAAAIAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540 
DSAEQLSSPM PSATPREPEN HFVGGAEASA PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 
AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK FLYIDTKMRP 660
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMLIDIMDT SQQKSAGLLM 720
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRTI 780
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGIEEVD 900
SRSSSDHHLE TTSTISTVSS ISTLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKPI 960
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEIKSPI 1020
LSGPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPEIM STISGTRSTT 1080
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPVVSPT EMNKETLPAP LSAATASPSP 1140
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200 
ESLNLGEHKE AFMDNEIDGS HLPNLQKEDL IDLGVTRVGH RMNIERALKQ LLDR



SEQ ID NO:166 PEZ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000024

Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I

ACTGCGAAGC GGCTTCTTCA GAGCACGGGC TGGAACTGGC AGGCACCGCG AGCCCCTAGC 60 
ACCCGACAAG CTGAGTGTGC AGGACG AGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120 
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTGAGG 180 
CGCCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGCGCCAJTQGGGCAACC CGGGAACGGC 240 
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGGGCCGG ACCACG ACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCGT CCTGGCCATC 360 
GTGTTTGGCA ATGTGCTGGT CATCACAGCC ATTGCCAAGT TCG AGCGTCT GCAGACGGTC 420 
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGTGGTG 480 
CCCTTTGGGG CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540 
TTTTGGACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600 
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAG AGCCT GCTGACCAAG 660 
AATAAGGCCC GGGTGATCAT TCTGATGGTG TGG ATTGTGT CAGGCCTTAC CTCCTTCTTG 720 
CCCATTCAGA TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGACCTGCT GTGACTTCTT CACGAACCAA GCCTATGCCA TTGCCTCTTC CATCGTGTCC 840 
TTCTAOGTTC CCCTGGTGAT CATGGTCTTC GTCTACTOCA GGGTCTTTCA GG AGGCCAAA 900 
AGGCAGCTCC AGAAGATTGA CAAATCTGAG GGCCGCTTCC ATGTCCAGAA CCTTAGCCAG 960 
GTGGAGCAGG ATGGGGGGAC GGGGCATGGA CTCCGCAG AT CTT0CAAGTT CTGCTTGAAG 1020 
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCACCCT CTGCTGGCTG 1080 
CXXTTCTTCA TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCG TAAGG AAGTT 1140 
TACATCCTCC TAAATTGG AT AGGCTATGTC AATTCTGGTT TCAATCCCCT TATCTACTGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCrTCTTTG 1260 
AAGGCCTATG GGAATGGCTA CTCCAGCAAC GGCAACACAG GGGAGCAGAG TGGATATCAC 1320 
GTGG AACAGG AG AAAG AAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGOTACTGT GCCTAG CGAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440 
ACAAATG ACT CACTGCTGTA_A_AGCAGTTTT TCTACTTTTA AAGACCCCCC CCCCXTCCAAC 1500 
AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560 
TGTATAGAGA TATGCAGAAG GAAGGGCATC CTTCTGCCTT TTTTATTTTT TTAAGCTGTA 1620 
AAAAG AGAGA AAACTTATTT G AGTGATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTA AAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTAC CTC A CTATTCAAG TATTAGGGGT AATATATTGC 1800 
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA CCCTTGGACT TGAGGATTTT 1860 
GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920 
ACACGGGGTA TTTTAGGCAG GGATTTGAGG AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 



SEQ ID NO:167 PEZ4 Protein sequence:
Protein Accession #: NP_000015.1
MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWVVGMGIVM SLIVLAIVFG NVLVITAIAK 60
FERLQTVTNY FITSLACADL VMGLAVVPFG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QKIDKSEGRF 240
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHVIQD 300
NLIRKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL



SEQ ID NO:168 PEZ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004457

Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

GAATTCGTTG TTGGG AAGG A CTGGGGAAAC AGCTGTAACA TTTGCCACCC TCAGAAGCTG 60 
CTGGTCCTGT GTCACACCAC CTTAGCCTCT TGATCGAGGA AGATTCTCGC TGAAGTCTGT 120 
TAATTCTACT TTTTGAGTAC TTATGAATAA CCACGTGTCT TCAAAACCAT CTACCATGAA 180 
GCTAAAACAT ACCATCAACC CTATTC1 1 11 ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATTTTA ACATACATTC CGTTTTATTT TTTCTCCG AG TCAAG ACAAG AAAAATCAAA 300 
CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTG ATTCT GCATACAG AT CTGTTAATAG 360 
TTTGGATGGT TTGGCTTCAG TATTATACCC TGGATGTG AT ACTTTAGATA AAGTTTTTAC 420 
ATATGCAAAA AACAAATTTA AG AACAAAAG ACTCTTGGGA ACACGTGAAG TTTTAAATGA 480 
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATA A 540 
TTGGCTTTCC TATGAAGATG TCTTTGTTCG AGOCTTTAAT TTTGGAAATG GATTACAGAT 600 
GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660 
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GCCATTGTTC ATGCATTAAA TG AAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAG AACTC TTACAAACAA AGTTGAAGG A TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840 
CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900 
GCATACCATG GCTGCAGTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960 
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TCCAAAGGG A GTCATG ATCT CACATAGTAA CATTATTGCT GGTATAACTG GG ATGGCAG A 1080 
AAGGATTCCA GAACTAGGAG AGG AAG ATGT CTACATTGG A TATTTGCCTC TGGCCCATGT 1140 
TCTAGAATTA AGTGCTGAGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200 
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260 
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTG AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380 
TAATTACAAA ATGG AACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440 
TTTCCGG AAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500 
TCCACTTTCT GCAACCACGC AGCG ATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560 
GGGATACGGG CTCACTGAAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGGACTACAA 1620 
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGGATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGG GGCCA 1740 
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800 
TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860 
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920 
TCTTGGGAAA GT AG AG GC AG CTTTGAAGAA TCTTCCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040 
TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100 
AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160 
GTTTGAAATT CCAGTAAAAA TTCGTTTGAG TCCTG AACOG TGGACCCCTG AAACTGGTCT 2220 
GGTGACAGAT GCCTTCAAGC TGAAACGGAA AG AGCTTAAA ACACATTACC AGGCGG ACAT 2280 
TGAGCGAATG TATGGAAG AA A ATAA TTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340 
TCACATCAAA TAGG AAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400 
CTCATATTAA ACTATTACTT CTCATGACGT CACCATTTTT AACTGACAGG ATTAGTAAAA 2460 
CATTAAG AC A GCAAACTTGT GTCTGTCTCT TCTTTC ATTT TCCCCGCCAC CAACTTACTT 2520 
TACCACCTAT GACTGTACTT GTCAGTATG A GAATTTTTCT G AATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAGTTTG 2640 
TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700 
TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760 
ATTTCAAATG CTTATGAATC AAATCATTGT TG AACAAAAG ATTTGTTGCT GTGTAATTAT 2820 
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT TATGTTTTAA GAAGTTGAGA 2880 
TCTTGTGAAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAGAA AAAAT 2940 
GAAGTTTGGT TGGTGATGCA TGAAACAAAA TAGCAAGAGA GGGTTATAGT TTAATAGTAA 3000 
GGGAG ATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060 
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTG AG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240 
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAG ATA ATCATTATTT CATTTTAAAA 3300 
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGGATG TGTGTGAGTC AGTGGTAGCT 3360 
TATTTAAAAA GCACCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATGAAAGAAT 3420 
TTAGAATGTA TTTG ATGATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480 
GCGTGAGTT A AGATTTAATT CATAGGTTTT GATGTCATTG TTGAAGTTAT TTGTAATTCA 3540 
GAAACCTTGC TTGTGTG ATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600 
ATATCTGGAT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGGACTTAA 3660
ATCATAGGCA CCACATTTTT CATGTCAG AC TAGTTACTTT GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 



SEQ ID NO:169 PEZ1 Protein sequence:
Protein Accession #: NP_004448.1

1 11 21 31 41 51 
I I I I I I 

MNNHVSSKPS TMKLKHTINP ILLYFIHFLI SLYTILTYIP FYFFSESRQE KSNRIKAKPV 60 
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTREV LNEEDEVQPN 120 
GKIFKKVILG QYNWLSYEDV FVRAFNFGNG LQMLGQKPKT NIAIFCETRA EWMIAAQACF 180
MYNFQLVTLY ATLGGPAIVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHIITVDGK 240
PPTWSDFPKG IIVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVMIS 300 
HSNIIAGITG MAERIPELGE EDVYIGYLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360
SSKIKKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420 
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMNICFCCP VGQGYGLTES 480
AGAGTISEVW DYNTGRVGAP LVCCEIKLKN WEEGGYFNTD KPHPRGEILI GGQSVTMGYY 540
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKIIDRK KDLVKLQAGE YVSLGKVEAA 600 
LKNLPLVDNI CAYANSYHSY VIGFVVPNQK ELTELARKKG LKGTWEELCN SCEMENEVLK 660
VLSEAAISAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADIERMYGRK

SEQ ID NO:170 PCQ7 DNA SEQUENCE

Nucleic Acid Accession #: none found 

Coding sequence: 38-1075 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I

AGCAACGACG CCGGGCAGCG GGAGCGGCGG CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60

CCTGCTGCTG AGCAGCGCCG CGGAGAGCCA GCTGCTCCCC GGGAACAACT TCACCAATGA 120

GTGCAACATA CCAGGCAACT TCATGTGCAG CAATGGACGG TGCATCCCGG GCGCCTGGCA 180 

GTGTGACGGG CTGCCTGACT GCTTCGACAA GAGTGATGAG AAGGAGTGCC CCAAGGCTAA 240 

GTCGAAATGT GGCCCAACCT TCTTCCCCTG TGCCAGCGGC ATCCATTGCA TCATTGGTCG 300 

CTTCCGGTGC AATGGGTTTG AGGACTGTCC CGATGGCAGC GATGAAGAGA ACTGCACAGC 360 

AAACCCTCTG CTTTGCTCCA CCGCCCGCTA CCACTGCAAG AACGGCCTCT GTATTGACAA 420 

GAGCTTCATC TGCGATGGAC AGAATAACTG TCAAGACAAC AGTGATGAGG AAAGCTGTGA 480 

AAGTTCTCAA GAACCCGGCA GTGGGCAGGT GTTTGTGACT TCAGAGAACC AACTTGTGTA 540 

TTACCCCAGC ATCACCTATG CCATCATCGG CAGCTCCGTC ATTTTTGTGC TGGTGGTGGC 600 

CCTGCTGGCA CTGGTCTTGC ACCACCAGCG GAAGCGGAAC AACCTCATGA CGCTGCCCGT 660 

GCACCGGCTG CAGCACCCTG TGCTGCTGTC CCGCCTGGTG GTCCTGGACC ACCCCCACCA 720 

CTGCAACGTC ACCTACAACG TCAATAATGG CATCCAGTAT GTGGCCAGGC AGGCGGAGCA 780 

GAATGCGTCG GAAGTAGGCT CCCCACCCTC CTACTCCGAG GCCTTGCTGG ACCAGAGGCC 840 

TGCGTGGTAT GACCTTCCTC CACCGCCCTA CTCTTCTGAC ACGGAATCTC TGAACCAAGC 900 

CGACCTGCCC CCCTACCGCT CCCGGTCCGG GAGTGCCAAC AGTGCCAGCT CCCAGGCAGC 960 

CAGCAGCCTC CTGAGCGTGG AAGACACCAG CCACAGCCCG GGGCAGCCTG GCCCCCAGGA 1020 

GGGCACTGCT GAGCCCAGGG ACTCTGAGCC CAGCCAGGGC ACTGAAGAAG TATAAGTCCC 1080 

AGTTATTCCA AAGTCCATAT GGGTTAATCT GCTCTGACTT GTTGCCATTC TAACAATTTG 1140 

TGCTCATGGG AAGCTCTTTA AGCACCTGTA AGGATGTCTC AAGTTACAGT TTGGGATATT 1200 

AACTATCTCT GCATTCCCCT CCTCCCCCAG ACTTCAGAGA TGTTTTTCTG GCGTCTCAGT 1260 

TGACATGATC TGTTGTGCGT CTTTTCTGTC AGGTCACTCT TCCCTTGGGA CCCGAGATCA 1320 

CACCCTCATT TTTCACATTA TTCTGTTTCT GTTGGAGAGA CAGCATATAA AACAGTATTG 1380 

AAATAGGCTG GGAGAGAGCA ATGTTTCTGT GCTATATTGG ATGCTCAGAA GTGCAGGAGA 1440 

CGCTGGACCC AATTCTCTCT GCTGGGTAGT TACCTTATAG CATTTGGGGA TTTGGGTTAG 1500 

ATGATCTAAC CAGGAGGCCA TCACTGGATG GTCACCCCCC CAAAAAAATT CCATTTGAGC 1560 

ATCAAAACCT GCTTTGCACA ATCCTATTTG ATCCCCCCAG TTCAGCAGAG TCAGTGGCCA 1620 

AAGAAAACTT TGGACGTGAG TAACACCCTT CAGCAGTCGC AACGTTATTT TGGTTTTGTG 1680 

AAGGACTCTG AAACCATCTA CCCTGTATAA ATTCTGGCTT TAGAAATTTG CCCAAGAATG 1740 

CTCATTCTGA GAGCTTTCCT CAGCAGCATA TATCATCAGC CTCATCCTAA AATAGGCAGG 1800 

GAGCCCCTCC CATGAGTTTA TCCAAGTTCT CAGCTCCTAA AATGCAGGCT GCCAAGACCC 1860 

TACACCTGCC CTGGCTCTAC AGCCACTTAC CTGGTTTCTG GACTGTCACC CTCCCAGCTG 1920 

ACCTGCCCGT AGCCAAGGAA TGAGGACCTA ACTTGAGTTG GCCCAAAGTC TGACCTGGCT 1980 

GTATGTCCCT GTGGCCCACA CCCAGCCTGT CTTGCTCATT CATGCAGCCT CAACACTGGC 2040 

CTCCAAAGTT CCCTTAACAC TTGCAAAGTC CTTTTTACCT GTGCATTTGG ACTTGAGGAC 2100 

ACTGGTTTCT ATCACAGGTG AGAGCCATGT TCAATACCTC CAGCAAGCTC TCCTGGCTCC 2160 

CTGCACTGTG CACGCTCCTC TTCCCAAGGT CCCAATACCA GCACCTCTAG TTAGAGTTAG 2220 

GGTCAGGGTC AGGCCTCTCC CAACATCCCA GTAGTTTCTC CTCTGAGACA CATGGGCAAG 2280 

AGACAATTTG GAGTCAAGAT TTTCCATTTG GATCTATTTT AAATCTTTTA GAAATGCATT 2340 

TGAAACAGTG TGTTTGTTTT TTCCCTTCTA GTTAAGGGAC TATTTATATG TGTATAGGAA 2400 

AGCTGTCTCT TTTTTTGTTT TTCCTTTAAC AAGGTCCAAA GAAAGATGCA AAAGGAGATC 2460 

ACACCCTTGC CCCGCTGAGC CCCGTGATAA CAAGTCACTC CAGACTAACC TGTGTGCCAG 2520 

ACATTTGTGC ATTGTTGCAC TTTGAGGTTA TTATTTATCA AGTTCTTGAA GGAAGCAGAA 2580 

AGAGGGACTC CTCTCTCCCT CCGTGTATAG TCTCTATGTT TGTGCTAGTT TTTCTTTTTT 2640 

TTCTCTGTGT CCAGTCAGCC ACAGGGCCCG CCTCCCTGCA GGAATAAGGG GTAAAACGTT 2700 

AGGTGTTGTT TGGCAAGAAA CCACACTGAC TGATGAGGGG TAAAATGGAA CCAGGTAGAG 2760 

CCACTCCGGG CAGCTGTCAC CCATTCAGAA CTTCTTTCCG CAGCTGAAGA AATGTTCAGT 2820 

AACCTGTTTG ACGCTAATTA AAACAGAGCC TGCAGGAAGT GGGGCTAAAG TGGCATTCAG 2880 

TGATCCTGTT CTGTAGACTT TTCTTTCTTT TTTTAACCAA ATCCAAAGGA TGTTACAGAA 2940 
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AAGCTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000 

AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAGTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCTGAA 3120 

CATTTCATCT CCTGTGAGTC AGAAGGGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TTTCTGGTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240 

GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300 

AGATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TGATTTTTTT AATGAATGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTGGGGG GAGGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540 

AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTG TGATGACTGG CCTATTACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720 

GAAAGGTTGT GTGTCGTTGC TTTTTGTGTT TTGGTTAGGC TTGGTTTTGT TTTTTAATTT 3780 

TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMJVRM 3840 

AAMMAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

TGGGGCGGCG GGGCCCACGT AGGTACGGCG ACCACGCGGG CCCAAACGGG ACCCCAGAAG 3960 

GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020 

GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAGAGAG AACTCAGAAG CACACAAGCG 4080 
GGACTCAACC AGGAGGACCC AAGGGAACCC GATAGAGTAC G 



SEQ ID NO:171 PCQ7 Protein sequence

Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

MWLLGPLCLL LSSAAESQLL PGNNFTNECN IPGNFMCSNG RCIPGAWQCD GLPDCFDKSD 60 
EKECFKAKSK CGPTFFPCAS GIHCIIGRPR CNGFEDCPDG SDEENCTANP LLCSTARYHC 120 
KNGLCIDKSF ICDGQNNCQD NSDEESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180 
VTFVLWALL ALVLHHQRKR NNLMTLPVHR LQHPVLLSRL WLDHPHHCN VTYNVNNGIQ 240 
YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300 
NSASSQAASS LLSVEDTSHS PGQPGPQEGT AEPRDSEPSQ GTEEV 



SEQ ID NO:172 PEL3 DNA SEQUENCE 
Nucleic Acid Accession #: NM_005656.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACGAG GTGCATCCGG 180 

CTCAGTACTA CCCGTCCCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATCCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540 

CATCTCAGAG GAAGTCCTGG CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGCCTGCGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840 

CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGACAATCT TTCATGTTCT 1020 

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080 

AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620 

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680 

TGCCATACTC TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGCAAGGGG TGATGGCCGG CTGGTTGTGG GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GCCCCCATTG AGATCTTCCT GCTGAGTCCT TTCCAGGGGC CAATTTTGGA 1860 

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATGAAAAA GGAGAGACAT 1920 

GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040 

GATGGTGGCC AGAAATAAAG GGACCAGCOC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG CCCTTGGTGC 2160 
PCT/US01/32045 



GAGGGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220 

CATTGGGTGG GGCTCCTGGG AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA GTTTGGCACC 2340 

ATGTCGGOCT CTTCAGGCCT GATAGTCATT GGAAATTGAG GTCCATGGGG GAAATCAAGG 2400 

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTCAA AGCCATCTT 



SEQ ID NO:173 PEL3 Protein sequence
Protein Accession #: NP_005647.1

1 11 21 31 41 51
I I I I I I
HALNSGSPPA IGPYYENHGY QPENPYPAQP TVVPTVYEVH PAQYYPSPVP QYAPRVLTQA 60
SNPVVCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFKG SKCSNSGIEC 120
DSSGTCINPS NWCDGVSHCP GGEDENRCVR LYGPNFILQM YSSQRKSWHP VCQDDWNENY 180
GRAACRDMGY KNNFYSSQGI VDDSGSTSFM KLNTSAGNVD IYKKLYHSDA CSSKAVVSLR 240
CLACGVNLNS SRQSRXVGGE SALPGAWPWQ VSLHVQNVHV CGGSIITPEW IVTAAHCVEK 300
PLNNPWHWTA FAGILRQSFM FYGAGYQVQK VISHPNYDSK TKNNDIALMK LQKPLTPNDL 360
VKPVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC NSRYVYDNLI 420
TPAMICAGFL QGNVDSCQGD SGGPLVTSNN NIWWLIGDTS WGSGCAKAYR PGVYGNVMVF 480
TDWIYRQMKA KG



SEQ ID NO:174 PBJ4 DNA SEQUENCE
Nucleic Acid Accession #: AI694767
Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)



l 
I 

CAGAGAGGCT 
GGGGTCACAC 
AGCTTCTTCA 
ATAGGCCTCC 
TACCTTATTG 
CTGCATGAGC 
ACCTCATCCA 
GATGCTTGTC 
CTGCTGGCCA 
GTACTTACGT 
CTGATGGCAC 
TCCCATTCCT 
AATGTCGTCT 
TCCTTCTCAT 
AAGGCATTTG 
ATTGGATTGT 
TTGGCCAATA 
ACAAAGGAGA 
CCCTAGGTGT 
GTTAACATTT 
ATCCTTCAAA 

TTTTCATTTT 
GAGATAAGAA 
TAAACACAGA 
ACTCCCAACC 
AAATAATTTT 
AGAGTACATT 
ATGGACCCTG 
TTAGTACCCT 
GGGGTCATAC 
GGAAGAACTG 
TTCTARAGGA 
GCAACAGAAC 
AATTACCTGT 
AGAAAGTCTG 
TGATAGGCAG 
TGAAGATAAC 
ACCATGCTTT 
ATCTGACTTA 
ATAGGTTTCA 
TACTAAAACA 
CCTGATATGG 
AATGCCTATT 
TATTGAATGT 
AAAGTGCCTA 
TTCCTTCTGT 
TTAAATTTTA 
GCTCATAAAA 



11 
I 

GTATTTCAGT 
ATTCCTTCCA 
TGATGGTGGA 
CTGGTTTAGA 
CTGTGCTAGG 
CCATGTATAT 
TGCCCAAAAT 
TGCTACAGAT 
TGGCTTTTGA 
TGCCTCGTGT 
CCCTTCCTGT 
ACTGCCTACA 
ATGGCCTTAT 
ATCTGCTTAT 
GCACTTGCGT 
CCATGGTGCA 
TCTATCTGCT 
TTCGACAGCG 
CAG7GATCAA 
TGGAAGACAG 
TATGAAACTG 
TACATATAAT 
ACCATGCAGT 
TGGTACATCT 
ATATAATAAA 
ACATTGGATC 
TCCTCTGGAC 
TACCTACGTT 
TTTTTCCTAT 
CATTGTAGCC 
AAGTATAAAA 
TTAAAGAGAC 
GGTATTTAAT 
TCATGGCTTT 
GTCTTGGAAG 
CATAGGGCTT 
TGAGGTTAGG 
ATTGGCCTTT 
ATTTGGGGCT 
GGCATGGGAA 
TCTTCAACAG 
TGTGATCATA 
ATTCCTATNA 
TAATACTTGT 
CATCTCTGTT 
GAACATAATA 
GCTGAACACA 
GCCATTACTT 
CCCTCCCATG 



21 
I 

GCAGCCTGCC 
TACGGTTGAG 
TCCCAATGGC 
AGAGGCTCAG 
TAACTTGACA 
ATTTCTTTGC 
GCTGGCCATC 
GTTTGCCATC 
CCGCTATGTG 
CACCAAAATT 
CTTCATCAAG 
CCAAGATGTC 
CGTCATCATC 
TCTTAAGACT 
CTCTCATGTG 
TCGCTTTAGC 
GGTTCCTCCT 
CATCCTTCGA 
ACTTCTTTTC 
TATTCAGAAA 
GTTGGGGAAT 
TATTAATACC 
CCAAATCTAA 
AGAGAACATT 
ATGAGATAAT 
TCAGAAAAAT 
ACTAGCACTT 
AATGAAAGTT 
TTAATTTTCT 
ATGGGAAAAT 
ATTAAAAAAA 
CAACAGGGTA 
TTCTTCTCAC 
AATCCCACTA 
AAGTGATTTC 
ATAGCAAGTT 
GAGCCACCAG 
TGAGTGTGAC 
TTGTGCAGTA 
TCAGGCATTT 
GATATGACAA 
TATGTGGTAA 
CATGCTTTCA 
ATTTGCTGCT 
CATCATTGAC 
GTGCTTATGC 
TAGCCAGGCA 
CCAATGTGAG 
TGCAGCCTTT 



31 
I 

agacctcttc 
cctctacctg 
aatgaatcca 

TTCTGGTTGG 
ATCATCTACA 
ATGCTTTCAG 
TTCTGGTTCA 
CACTCCTTAT 
GCCATCTGTC 
GGTGTGGCTG 
CAGCTGCCCT 
ATGAAGCTGG 
TCCGCCATTG 
GTGTTGGGCT 
TGTGCTGTGT 
AAGCGGCGTG 
GTGCTCAACC 
CTTTTCCATG 
CATTCAGAGT 
AAAAATTTCC 
CTCCATTTTT 
CTGACTAGGT 
ACTGCTTCTA 
TGCCAAAGGC 
CTAGCTTAAA 
ACTGTCTTCA 
AAGGGGAAGA 
GACACACTGT 
TATCAACCCT 
TGATGTTCAG 
AAAGACTTCA 
GTGGGTTAGA 
TCATCCAGTG 
GCTATTGCTT 
TAGGTTCACC 
ATTTATTTTT 
TTATGATGGG 
TCGTAGCTGG 
TGGAACAGGG 
TTGCTTCTGA 
CAGTCTTAAC 
GTTTCATTTT 
TCCCCTTTTG 
GGACTGTAAG 
TGCTCTTTGC 
TTGACACCGG 
ATTTTCCAGC 
TGGAAGTGAC 
CATGTTGACA 



41 

I 

TGGAGGAAGA 
CCTGGTGCTG 
GTGCTACATA 
CCTTCCCATT 
TTGTGCGGAC 
GCATTGACAT 
ATTCCACTAC 
CTGGCATGGA 
ACCCACTGCG 
CTGTGGTGCG 
TCTGCCGCTC 
CCTGTGATGA 
GCCTGGACTC 
TGACACGTGA 
TCATATTCTA 
ACTCTCCACT 
CAATTGTCTA 
TGGCCACACA 
CCTCTGATTC 
TTAATAAAAA 
TCAATATTAT 
TGTGGTTGGA 
CTGATGGTTT 
CTAAGCACAG 
ACTATAACTT 
AAATGACTTC 
TTGGAAGTAA 
TCTGAGAGTT 
TTAATTAGGC 
TGGGGATCAG 
TGCCCAATCT 
GATTTCCAGA 
TTGTATTTAG 
ATTGTCCTGG 
ATTATGGAAG 
AAAAGTTCCA 
AAGTATGGAA 
AAAGTGAGGG 
ACTTTGAGAC 
GGGGCTATTA 
CAAGAAACTC 
CTTTTTCAAT 
TAATGGATAT 
CCCATGAGGG 
TCATCATTGA 
TTATTTTTCA 
CTTCTTTGAG 
ATGTGCAATT 
TTAAATGTGA 



51 



CTGGACAAAG 
GTCACAGTTC 
CTTCATCCTA 
GTGCTCCCTC 
TGAGCACAGC 
CCTCATCTCC 
CATCCAGTTT 
ATCCACAGTG 
CCATGCCACA 
GGGGGCTGCA 
CAATATCCTT 
TATCCGGGTC 
ACTTCTCATC 
AGCCCAGGCC 
TGTACCTTTC 
GCCCGTCATC 
TGGAGTGAAG 
CGCTTCAGAG 
AGATTTTAAT 
TACAACTCAG 
TTTCTTCTTT 
GGGTTATTAC 
ACAGCATTCT 
CAAAGGAAAA 
CCTCTTCAGA 
TACAGAGAAG 
AGCCTTGAAA 
TTCACAGCAT 
AAAGATATTA 
TGAATTAAAT 
CATATGATGT 
GTCTTACATT 
GAATTTCCTG 
TCCAATTGCC 
ATTCTTATTC 
TAGGTGTTTC 
TGGCAGGTGT 
AATCTTCAGG 
CGGGAAAGCA 
CCAAGGGTTA 
AAATTACATA 
CCTCAGGTTC 
CATATTTGGA 
CACTGTTTAT 
ATCCCCCAGC 
TCAAACCTGA 
TTGGGTATTA 
TTTATACCTG 
CTTGGGAAGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 






WO 02/30268 






TATGTGTTAC ACAGAGTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNN AANNAAACTG 3000 
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 






SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE
Protein Accession #: not available, cloned at Eos

1 11 21 31 41 51
I I I I I I
MVDPNGNESS ATYFILIGLP GLEEAQFWLA FPLCSLYLIA VLGNLTIIYI VRTEHSLHEP 60
MYIFLCMLSG IDILISTSSM PKMLAIFWFN STTIQFDACL LQMFAIHSLS GMESTVLLAM 120
AFDRYVAICH PLRHATVLTL PRVTKIGVAA VVRGAALMAP LPVFIKQLPF CRSNILSHSY 180
CLHQDVMKLA CDDIRVNVVY GLIVIISAIG LDSLLISFSY LLILKTVLGL TREAQAKAFG 240
TCVSHVCAVF IFYVPFIGLS MVHRFSKRRD SPLPVILANI YLLVPPVLNP IVYGVKTKEI 300
RQRILRLFHV ATHASEP



SEQ ID NO:176 PM72 DNA SEQUENCE
Nucleic Acid Accession #: NM_004624.1
Coding sequence: 57-1544 (underlined sequences correspond to start and stop codons)



TCGGAGCCTG 
CTCCTCCTCC 
TGGTGGTCGC 
GCGGCGGCGG 
CGCTCTTGGG 
ACAAGCAGTG 
GGGACAACCT 
CCCTCATCTT 
ACGAAGGCTG 
AGGCAGCGAG 
CCATTGGCTA 
TCAGGAAGCT 
TGAGGGCTGC 
AGTGCTCCGA 
TGGCTAACTT 
CCTTCTTCTC 
GCACATTCAC 
GGTGCTGGGA 
CCATCTTGGT 
GGCCCCCAGA 
TCCTGCTGAT 
TTAAGCCTGA 
TGGCTATCCT 
GGCGCTGGCA 
GCAGCAACGG 
CCCGCCGCTC 
CCAAGCGGCC 
GGGCGCGCCA 
GGACACTCCT 
GATGGGAGCT 
AGGCCCCCTA 
TGCTGGCTCT 
TGACCTGAGG 
CCTGAAATTT 
GACTGAAGAT 
GTGGGTTATT 
GTGGACTGGC 
CTGAAGCCTC 
TACCTGCTCT 
TTCTTATCTC 
CACCTATGTG 
AAGCAGATCC 
GTGAAAGCAC 
TTATTTGTTT 
CCCTCCCTGG 
CTGGTCACAG 
CCTCTGCCAG 
GGAAAAAAAA 



CGGAGGGTGG 
TCTCCTCTCG 
GGCGGCCGGG 
CCGAGGTGGG 
CTCCTCGCTG 
CCTGGAGGAG 
CACCTGCTGG 
CAAGCTCTTC 
GACGCACCTG 
TTTGGATGAG 



CCACTGCACG 



GGGCTCGGTG 
CTTCTGGCTG 
TGAGCGGAAG 
CATGGTGTGG 
CACCATCAAC 
AAACTTCATC 
TATCAGGAAG 
CCCCCTGTTT 
AGTGAAGATG 
CTACTGCTTC 
CCTGCAGGGC 
CGCCACGTGC 
CTCCAGCTTC 
CCTCCCGCCC 
GCCCCGGCCC 
AGAGAACGCA 



CGCCAATCAA 
TCTGCCCAAT 
GCAGAAAGGT 
CACCATTGCT 
GCAGCTCACT 
CTGGAGTTTT 
CCCTGGGTCA 
TGGGAAATGA 
CCAAGTCTCA 
TCTGTGCTGT 
CCAACTGTTG 
TCACCCTGCT 
GGACTCTTAC 
ACCACTTGTA 
AGTGTGGCTG 
CCTCCTCTGT 
AAGATCCCCT 
AAAA 



TGGTGGTGGT 
CTCAGGCGCC 
GCTCGCTCTC 
GTCGCGCGGC 
CAGGAGGAGT 
GCCCAGCTGG 
CCAGCCACCC 
TCCTCCATTC 
GAGCCTGGCC 
CAGCAGACCA 
CTCGCCACCC 
CGGAACTACA 
ATCAAAGACT 
GGCTGTAAGG 
CTGGTGGAGG 
TACTTCTGGG 
ACCATCGCCA 
TCCTCACTGT 
CTGTTTATTT 
AGTGACAGCA 
GGAGTACACT 
GTCTTTGAGC 
CTCAATGGTG 
GTCCTGGGCT 
AGCACGCAGG 
CAAGCCGAAG 
CTTCCCACTC 
TGGGCTCGGA 
GCCCTAGAGC 
AGGATGCAGG 
GGGCAAAAAG 
TGGAGGAAAG 
TCTGCCCGGG 
GTCAAGTTCC 
ACCCTATTCT 
TGTTTGGAGA 
GTCTGGTGGG 
GAAGGCAGCC 
GTGGCTTCAT 
GGAAGCAACA 
TAACTAGGCT 
ACACATACAG 
TGCTAACTTT 
TTATTAATGC 
AGGAGGCCTC 
CTGCCCTTCA 
CAGGACTGCA 



GGTGGTGGCC 
TCGGTGGCGG 
GGGGAGGCCG 
GGAGGCGGCT 
GTGACTATGT 
AGAATGAGAC 
CTCGGGGCCA 
AAGGCCGCAA 
CGTACCCCAT 
TGTTCTACGG 
TTCTGGTCGC 
TCCACATGCA 
TGGCCCTCTT 
CAGCCATGGT 
GCCTCTACCT 
GGTACATACT 
GGATCCATTT 
GGTGGATCAT 
GCATCATCCG 
GTCCATACTC 
ACATCATGTT 
TCGTCGTGGG 
AGGTGCAGGC 
GGAACCCCAA 
TTTCCATGCT 
TCTCCCTGGT 
GCAGCAGACG 
GGCTGCCCCC 
CTGCCTGGAG 
TGGAACTCAG 
TCTACATACT 
CAACCGGTGG 
AAGGTCACCA 
TTTGGGTTAA 
CTCTTTACGC 
GCACACCTAT 
AGGACGGTGC 
ACCAGCGAAT 
CTGTCAAGTG 
GGAATCAAGA 
CAGAGATGTG 
GATTTGAACT 
TGTGTATCGT 
CATTATCCCT 
CATCTCATGT 
CCCCAGTGGC 
ACAGGCTTGT 



TTGGTCGGCG 
GGGCGGATCT 
CGAGCTTCGT 
GCAGATGATC 
AATAGGCTGC 
GGTAGTTGTC 
TGTAAGCCGC 
TGCCTGTGGT 
TTCTGTGAAG 
CACAGCTATC 
CCTCTTCATA 
CGACAGCGGG 
CTTTTTCCAA 
GTACACCCTG 
CATCGGCTGG 
TGAGGATTAT 
AAAGGGCCCC 
AATCCTGCTT 
AAGGCTAGCC 
CGCCTTCTTT 
GTCTTTCCAG 
GGAGCTGAGG 
ATACCGGCAC 
GACCCGCGTC 
CTGACCACCA 
CCGGGGACAG 
GGCCCCCTGG 
CGTTTCTAGC 
TCATTAGACT 
TTCATCCTGA 
ATCCTCAAAC 
GCACCAACAC 
GCATTACCAC 
TTAGTTATCA 
CTTAGTGGTT 
AACCCAAGGA 
GCTAGGTCTC 
GGACTCTGTC 
GACTGCCCTC 
CACCCATGGG 
CAGATCTGTC 
AACCAGCCAG 
GAATTCCCCT 
ATCATCTGGA 
CACTCAGCTT 
GCAACAATAA 



TCACT CATGC 
GTTACGCGGC 
CGCGGCGCAG 
GCTGCGCGCT 
GAGGTGCAGC 
AGCAAGATGT 
TTGGCCTGTC 
AGCTGCACCG 
TTGGATGACA 
ACCGGCTACA 
CTGAGCCTGT 
TCCTTCATCC 
GAGTCGGACC 
TATTGTGTCA 
CTTGCCGTCT 
GGGGTACCCA 
GGTCTGCTCA 
ATCCTCACCT 
CAGAAACTGC 
AGGTCCACAC 
CCGGACAATT 
GGTTTTGTGG 
CGGAAGTGGC 
CCGTCGGGAG 
AGCCCAGGTG 
GGATCCCAGC 
AGGCCTGCCC 
TCTCTGGTCC 
AAGTGAGAGA 
CCTCCTCCAA 
CTCTGCCCCC 
AACACTGGTG 
CACGGTAGTG 
TCAGGCATTT 
GCTTTTTAAA 
CCCCACCGAA 
CTGAGGGACT 
GGACTAAGCC 
ACACCAGCCA 
CTTGTCCACC 
CTCTGACAGA 
TGATAGGAAT 
ATCCTCTTGG 
TGCCACCCCA 
TAGGAGCCTG 
CCTACCCACA 
ATGTTGGCTT 



SEQ ID NO:177 PM72 Protein sequence
Protein Accession #: JC2195



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 



1 11 21 31 41 51

MPPPPLLSLR RLGGGWSAVT RLWAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLELRAA 60
RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LENETIGCSK MWDNLTCWPA TPRGQWVLA 120
CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDEQQ TMFYGSVKTG 180
YTIGYGLSLA TLLVATAILS LFRKLHCTRN YIHMHLFISF XLRAAAVFIK DLALFDSGES 240
DQCSEGSVGC KAAMVFFQYC VMANFFWLLV EGLYLYTLLA VSFFSERKYF WGYILIGWGV 300
PSTFIMVWTI ARIHFEDYGL LRCWDTINSS LWWIIKGPIL TSILVNFILP ICIIRILLQK 360
LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HYIMFAFFPD NFKPEVKMVF ELWGSFQGF 420
WAILYCFLN GEVQAELRRK WRRWHXQGVL GWNPKYRHPS GGSNGATCST QVSMLTRVSP 480
GARRSSSFQA EVSLV



SEQ ID NO:178 BFFB DNA SEQUENCE
Nucleic Acid Accession #: AL133619
Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I
ATGAGCGGTG CGGGGGTGGC GGCTGGGACG CGGCCCCCCA GCTCGCCGAC CCCGGGCTCT 60
CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT TGAGGCCGCA GAGCCCGCAG 120
CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG AGAAAAGCCT GCAGTTCCTG 180
CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG AGATCGAGCA TCTGAAGCGG 240
GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGGCCGGCCC TGCCTCCCCA GGCACACTCA 300
ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT CCAGCACACG CCTGGGCTCA 360
GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG CCCACCTGGC TGCACTGGCC 420
CCTGTATGCC AACCCAGTGG GTACAGGTTC TGGGGGACCT GGACAGATGC CGCTACCTCT 480
AGCCGTGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG TGCTGCTCTC GGGAAGCCCA 540
GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT GCTCCCCAGA CCTCCCTCCT 600
CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC CCTGCCCTGC TAGATCTTTG 660
CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC CTATGGCTCT GAGTCCTCAC 720
ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG GATCCCTTCC TGCCATCTGG 780
GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT TTCCTTGCCA CTTGTCCAAG 840
GCACTTCCCC ATCCTGACAG CGGCCCCCAC CCAGCCCAGG ATCCTGGGCT GTGGTCTCAA 900
GCTCACTTCC CATTATCTTT GGGGCTGGGG CTGACATCAG GAGGACATCT GACTGGTGGA 960
TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA GGGCTCTCCC TTCCCAGGGA 1020
GACATGGAGA AGGGGGTTGA GGGAGGGCCC TTCCCTAGCC GCTGTGGCAA CTCCAGTGAG 1080
CTGTTCTGGG CAAAGTGTGG CGCAAGTCGG CAGCCCCAGC CCTGCAGTGC TGGGGACGCT 1140
GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT GCTGTTCCAT GTGTCCCAAG 1200
CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT CCAGGGCCTC TGCTCCCTTG 1260
GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC CGGGAGGACC CAGCCCTGCC 1320
AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG GCAAGCGTGG GCGTCTTGCG 1380
GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGCC TCTCCATGTC AAGCTTCCAG 1440
TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA AGGCCAGGCC CCAGCCCGGC 1500
TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA AGGCGGACCT GGAAGAGGAG 1560
CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG TACAAGGGCA GGCCAGAAAG 1620
GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG GGAACAGCCA GCACCAGGGC 1680
AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC CCCTTCCCCT GCGAAAGCCC 1740
ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT GGAATACCAA CCTCCTGCAG 1800
ACCCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA GCCAGAGGCC CCAGGCAGCC 1860
CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCAGGC ATTTCCCCAA GGTCTCCACC 1920
AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG AGCGTGCCAT CCTGCCCGCA 1980
CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA AGAGGCTGCA GGCAATGCAG 2040
AAACGGCGCC TGCATCGCTC AGTGCTTTGA



SEQ ID NO:179 BFFB Protein sequence
Protein Accession #: T43457

1 11 21 31 41 51
I I I I I I
MSGAGVAAGT RPPSSPTPGS RRRRQRPSVG VQSLRPQSPQ LRQSDPQKRN LDLEKSLQFL 60
QQQHSEMLAK LHEEIEHLKR ENKGEPARGP RPALPPQAHS TLPLPQHRNT AINSSTRLGS 120
GGTQDGEPLQ TVLAHLAALA PVCQPSGYRF WGTWTDAATS SRGWTMLCSQ AQHVLLSGSP 180
GPEVTAGRQV ATGCSPDLPP PSRAEMGRNP WDSPCPARSL PQIAAVARPR ISSPMALSPH 240
MLGAQGIWTH SIQGSLPAIW AATMGTKGGS RVLFPCHLSK ALPHPDSGPH PAQDPGLWSQ 300
AHFPLSLGLG LTSGGHLTGG WSQPGNIAAG AVPRALPSQG DMEKGVEGGP FPSRCGNSSE 360
LFWAKCGPSR QPQPCSAGDA DRTREEAHLS LGTCCSMCPK PSCFPDGPSG NHLSRASAPL 420
GARWVCINGV WVEPGGPSPA RLKEGSSRTH RPGGKRGRLA GGSADTVRSP ADSLSHSSFQ 480
SVKSISNSAN SQGKARPQPG SFNKQDSKAD VSQKADLEEE PLLHNSKLDK VPGVQGQARK 540
EKAEASNAGA ACMGNSQHQG RQMGAGAHPP KILPLPLRKP TTLRQCEVLI RELWNTNLLQ 600
TQELRHLKSL LEGSQRPQAA PEEASFPRDQ EATHFPKVST KSLSKKCLSP PVAERAILPA 660
LKQTPKNNFA ERQKRLQAMQ KRRLHRSVL



SEQ ID NO:180 BCR4 DNA SEQUENCE
Nucleic Acid Accession #: NM_012319.2
Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 
CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120 






GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780 

CCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGOGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 



SEQ ID NO:181 BCR4 PROTEIN SEQUENCE
Protein Accession #: NP_036451

1 11 21 31 41 51

I I I I I I

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60 

FYRYGENNSL SVEGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGRGAHRPE HASGRRNVKD 180 

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLFPKBVSS STPPSVTSKS RVSRLAGRXT 240 

NESVSEPRKG FMYSRNTNEN PQECFNASKL LTSHGMGIQV PLNATEFNYL CPAIINQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISPLSL LGVILVPLMN RVFFKPLLSF 360 

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAMEMKR GPLFSHLSSQ NIEESAYFDS 420 

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480 

EEKVDTDDRT EGYLRADSQE PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540 

HSHFHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600 

MGDGLHNFSD GLAIGAAFTE GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 660 

NALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEML HNDASDHGCS 720 
RWGYFFLQNA GMLLGFGIML LISIFEHKIV FRINF 



SEQ ID NO:182 BCY2 DNA sequence
Nucleic Acid Accession #: NM_001203
Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

CGCGGGGCGC GGAGTCGGCG GGGCCTCGCG GGACGCGGGC AGTGCGGAGA CCGCGGCCCT 60 
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT 180 
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAGTTAAA CTTACAAGCC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC 360 
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTTCACGAT GATAGAAGAG GATGACTCTG GGTTGCCTGT GGTCACTTCT 480 
GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA 540 
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACCCTACA 600 
CTGCCTCCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660 
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG 720 
TATAAAAGAC AAGAAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780 
ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840 
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960 
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140 
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380 
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 1440 
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG 1500 
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG 1800 
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTOCTGCTTC CCAGTGGGTT CAGACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 1980 
TCTGTTTGTA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

SEQ ID NO:183 BCY2 Protein sequence

Protein Accession #: NP_001194



1 11 21 31 41 51 
I I I I I I 

MIXRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 60 
DSGLPVVTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120 
GPIHHRALLI SVTVCSLLLV LIILFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 180 
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENILG FIAADIKGTG SWTQLYLITD YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCCIAD LGLAVKFISD 360 
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HPQSYIMADM YSFGLILWEV ARRCVSGGTV 420 
EEYQLPYHDL VPSDPSYEDM RETVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480 
RLTALRVKKT LAKMSESQDI KL 



SEQ ID NO:184 CBF9 DNA sequence

Nucleic Acid Accession #: AC005383

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 



GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360 

GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540 

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 



376 



WO 02/30268 
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5 
.0 
.5 
JO 
15 
JO 
)5 
W 
15 
50 
55 
50 
65 



CAGGAAGTGA 
CTTGCTCTGA 
CAGATCCTCA 
CAGCTGAAGG 
GAGCTGCATG 
GAGGATGCCA 
ACGCCAGACT 
GAGTTCGCTG 
GCACACTGTC 
AGGACCACCT 
CCAGAAGGAC 
TGTGCCCTGA 
GCGGGCACCA 
GCCGTGCTGA 
CTGGTGGCGG 
GGCATTCCCT 
CGTGGCTTCG 
CTCACTGAGT 
GAGCTGCTCC 
GGCAGCCCAA 
GAGCTGCAGG 
CTCGTCTTCA 
AGCTTTGTGA 
CTGGTGGTGT 
GCTGCGATGC 
ACCGCCCTGC 
GTCCCCAAAG 
GCCCAGAAGC 
AGTGAGGGTC 
GCCGACCTGC 
CCAGTCAACC 
GGGAGCTACC 
TGGAGCTCTT 
ATGGCTCCCG 
GGCACTGAAA 
TTCCCGCCGT 
ATGCTGCTTA 
TTGATGTGTA 
CTGCCACCTT 

AGGCCTTTAC 
GCAGCTTTTC 
CTTGAGGGAC 
GGTCTCAGAC 
TGTGCATGGG 
ACCTTGAAGG 



AGGCAAGAAT 
AATACCTTCT 
TCATCGTCAC 
AAAGGGGTGT 
CACTGGCCAG 
CCAACGGCCT 
GCAGGGTCGA 
GCAATGCCCC 
CCTTCTACAG 
GCCCAGGCCC 
TGGACGGCTA 
AGCTGAGCCT 
CTCTGGACGG 
GCGAGGACTC 
TGCCTGTGGG 
TCCGTGGTGG 
GGAGCGCCAC 
CACACTCCGA 
TGCTGGGTGT 
AGCATGTGAT 
GGAAGCTGTG 
TGTTGGACAC 
GAAGCTGTGC 
ATGGCAGCCA 
TGCGGGCCAT 
TGCACATCTA 
CTGTGGTGGT 
TGAGGAACAA 
TGCGGAGGCT 
GGTACCACCA 
TCTGCAAACC 
GCTGCAAGTG 
GCTCTGTATG 
TGCAGGAGGG 
TGGTGCCTAC 
GGCCAGGACC 
GAGACAAGAA 
AGTAAATACC 
TCCCTTGAGG 
CACACAATCA 
TAGAGCATCC 
CACTTCCCCA 
GTTTGTGACT 
TGAATGTGAC 
CCCAGGTCTG 
TCTTC 



CAAGAGGATG 
GCACAGAGGG 
TGATGGGAAG 
CACTGTGTTT 
CGAGCCTAGA 
CTTCAGCACC 
GGCTCACCCC 
ATGCTGGAGA 
CTGGAAGAGA 
CTGTGACTCG 
CCAGTGCCTC 
GGAATGCAGG 
CTTCCTGCGG 
TCGGGCCCGA 
GGA6TACCAG 
CCCCACCCTG 
CAGGACAGGC 
GGATGAGGTT 
AGGCAGTGAG 
GGTCTACTCG 
CAGCCGGCAG 
CTCTGCCTCA 
CCTCCAGTTT 
GGTGCAGACT 
TAGCCAGGCC 
TGACAAAGTG 
GCTCACAGGC 
TGGCATCTCT 
TGCAGGTCCC 
GGACGTGCTC 
CAGCCCGTGC 
TCGGGATGGC 
TGTGAGCCAG 
CAGCAGCCGT 
CTTCTCGAAT 
ACTATTCTCA 
AGCAGCTGAT 
CACTTTCTGT 
ATAAACAAGG 
ATGCTCGCCA 
TTTGGACGGC 
GAGACATTCT 
TCTTGGCGAC 
CAATTAACCA 
GAGGGCCACG 



GTTTTCAAAG 
TTGCCTGGAG 
TCCCAGGGGG 
GCTGTGGGGG 
GGGCAGCACG 
CTCAGCAGCT 
TGTGAGCACA 
GGATCGCGGC 
GTGTTCCTAA 
CAGCCCTGCC 
TGCCCGCTGG 
GTCGACCTCC 
GCCAAAGTCT 
GTGGGTGTGG 
GATGTGCCTG 
ACGGGCAGTG 
CAGGAOCGGC 
GCGGGCCCAG 
GCCGTGCGGG 
GATCCTCAGG 
CGGCCAGGGT 
GTAGGGCCCG 
GAGGTGAACC 
GCCTTCGGGC 
CCCTACCTAG 
ATGACCGTCC 
GGGAGAGGCG 
GTCTTGGTCG 
CGGGATTCCC 
ATTGAGTGGC 
ATGAATGAGG 
TGGGAGGGCC 
GGATGGATTC 
ACCCCTCCCA 
GTCTGTGCCC 
CTGAGGGAGG 
GTCACCCACA 
ACCTGCTGTG 
GGTCCTGAAG 
GAATGTTGTT 
GAAGGCCACG 
GGATGCATTT 
TGCCTTTTGT 
GCTTGGTTGA 
TAAAATCGTT 



GAGGGCGCAC 
GCAGAAATGC 
ATGTGGCACT 
TCAGGTTTCC 
TGCTGTTGGC 
CGGCCATCTG 
GGACGCTGGA 
GGACCCTTGC 
CCCACCCTGC 
AGAATGGAGG 
CCTTTGGAGG 
TCTTCCTGCT 
TCGTGAAGCG 
CCACATACAG 
ACCTGGTCTG 
CCTTGCGGCA 
CACGTAGAGT 
CGCGTCACGC 
CAGAGCTGGA 
ATCTGTTCAA 
GCCGGACACA 
AGAATTTTGC 
CTGACGTGAC 
TGGACACCAA 
GTGGGGTGGG 
AGAGGGGTGC 
CAGAGGATGC 
TGGGCGTGGG 
TGATCCACGT 
TGTGTGGAGA 
GCAGCTGCGT 
CCCACTGCGA 
TTGAGACGCC 
GCAACTACAG 
CAGGTCCTTA 
AGGATGTCCC 
AACGATGTTG 
CCTTGTTGAG 
ACTTAAATTT 
GACACAGTAA 
GCCTTTCAAG 
GCATTGAGTC 
GTGTGGAAGA 
TGATGGGGGA 
CTGAGTCGTG 



GGAGACGGAA 
TTCTGTGCCC 
GCCATCCAAG 
CAGGTGGGAG 
TGAGCAGGTG 
CTCCAGCGCC 
GATGGTCCGG 
GGTGCTGGCT 
CACCTGCTAC 
CACATGTGTT 
GGAGGCTAAC 
GGACAGCTCT 
GTTTGTGCGG 
CAGGGAGCTG 
GAGCCTCGAT 
GGCGGCAGAG 
GGTGGTTTTG 
AAGGGCGCGA 
GGAGATCACA 
CCAAATCCCT 
AGCCCTGGAC 
TCAGATGCAG 
ACAGGTCGGC 
ACCCACCCGG 
CTCAGCCGGC 
CCGGCCTGGT 
AGCCGTTCCT 
GCCTGTCCTA 
GGCAGCTTAC 
AGCCAAGCAG 
CCTGCAGAAT 
GAACCGTGAG 
CCTGAGGCAC 
AGAAGGCCTG 
_GAATGTCTGC 
AACTGCAGCC 
TTGAAAAGTT 
GCTATGTCAT 
AGCGGCCTGA 
TGCCCAGCAG 
ATGGAAAGCA 
TGAAAGGGGG 
GACTTGGAAA 
GGGGCTGAGT 
AGCAGTGTCC 



720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 



1 
I 

MPPFLLLEAV 
SVGKGSFERS 
MVFKGGRTET 
FAVGVRFPRW 
PCEHRTLEMV 
SQPCQNGGTC 
RAKVFVKRFV 
LTGSALRQAA 
EAVRAELEEI 
SVGPENFAQM 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVLQ 
RTPPSNYREG 



11 

I 

CVFLFSRVPP 
KHFAITVCDG 
ELALKYLLHR 
EELHALASEP 
REFAGNAPCW 
VPEGLDGYQC 
RAVLSEDSRA 
ERGFGSATRT 
TGSPKHVHVY 
QSFVRSCALQ 
GTALLHIYDK 
LSEGLRRLAG 
NGSYRCKCRD 
LGTEMVPTFW 



21 
I 

SLPLQEVHVS 
LDISPERVRV 
GLPGGRNASV 
RGQHVLLAEQ 
RGSRRTLAVL 
LCPLAFGGEA 
RVGVATYSRE 
GQDRPRRVW 
SDPQDLFNQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSLIHVAA 
GWEGPHCENR 
NVCAPGP 



31 
I 

KETIGKISAA 
GAFQPSSTPH 
PQILIIVTDG 
VEDATNGLFS 
AAHCPFYSWK 
NCALKLSLEC 
LLVAVPVGEY 
LLTESHSEDE 
PELQGKLCSR 
GLWYGSQVQ 
GVPKAWVLT 
YADLRYHQDV 
EWSSCSVCVS 



41 

I 

SKMMWCSAAV 
LEFPLDSFST 
KSQGDVALPS 
TLSSSAICSS 
RVFLTHPATC 
RVDLLFLLDS 
QDVPDLVWSL 
VAGPARHARA 
QRPGCRTQAL 
TAFGLDTKPT 
GGRGAEDAAV 
LIEWLCGEAK 
QGWILETPLR 



SEQ ID NO:185 CBF9 Protein sequence
Protein Accession #: none found



51 
I 

DIMFLLDGSN 
QQEVKARIKR 
KQLKERGVW 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTLDGFL 
DGIPFRGGPT 
RELLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLRNNGI 
QPVNLCKPSP 
HMAPVQEGSS 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



SEQ ID NO:186 PAV1 DNA sequence

Nucleic Acid Accession #: AF272890

Coding Sequence: 87-1520 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

TGCTACCCGC GCCCGGGCTT CTGGGGTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 
CCCGCCCCCG GCCTCCGCAG CTCGGCATGG GCGCGGGGGT GCTCGTCCTG GGCGCCTCCG 120 
AGCCCGGTAA CCTGTCGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 180 



TGCTGGTGCC 
CGCTGTCTCA 
TCGTGGCGGG 
TCACCAACCT 
TGCCGTTCGG 
AGCTGTGGAC 
TTGCCCTGGA 
GCGCGCGGGC 
TGCCCATCCT 
ACCCCAAGTG 
CCTTCTACGT 
AGAAGCAGGT 
CGCCCTCGCC 
CCGCCGCCGC 
CGCGCCTCGT 
TCTTCACGCT 
AGCTGGTGCC 
TCAACCCCAT 
GCTGCGCGCG 
CGGGCTGTCT 
ACGACGATGT 
ACGGCGGGGC 
CCTCGGAATC 
GGGAACGAGG 
CCTCGTCTGA 
TTTGGGAAGG 



CGCGTCGCCG 
GCAGTGGACA 
CAATGTGCTG 
CTTCATCATG 
GGCCACCATC 
CTCAGTGGAC 
CCGCTACCTC 
GCGGGGCCTC 
CATGCACTGG 
CTGCGACTTC 
GCCCCTGTGC 
GAAGAAGATC 
CTCGCCCTCG 
CGCCGCCACC 
GGCCCTACGC 
CTGCTGGCTG 
CGACCGCCTC 
CATCTACTGC 
CAGGGCTGCC 
GGCCCGGCCC 
CGTCGGGGCC 
GGCGGCGGAC 
CAAGGTGTAG
AGATCTGTGT 
ATCATCCGAG 
GATGGGAGAG 



CCCGCCTCGT 
GCGGGCATGG 
GTGATCGTGG 
TCCCTGGCCA 
GTGGTGTGGG 
GTGCTGTGCG 
GCCATCACCT 
GTGTGCACCG 
TGGCGGGCGG 
GTCACCAACC 
ATCATGGCCT 
GACAGCTGCG 
CCCGTCCCCG 
GCCCCGCTGG 
GAGCAGAAGG 
CCCTTCTTCC 

CGCAGCCCCG 
CGCCGGCGCC 
GGACCCCCGC 
ACGCCGCCCG 
AGCGACTCGA 
GGCCCGGCGC 
TTACTTAAGA 
GCAAAGAGAA 
TGGCTTGCTG 



TGCTGCCTCC 
GTCTGCTGAT 
CCATCGCCAA 
GCGCCGACCT 
GCCGCTGGGA 
TGACGGCCAG 
CGCCCTTCCG 
TGTGGGCCAT 
AGAGCGACGA 
GGGCCTACGC 
TCGTGTACCT 
AGCGCCGTTT 
CGCCCGCGCC 
CCAACGGGCG 
CGCTCAAGAC 
TGGCCAACGT 
TCAACTGGCT 
ACTTCCGCAA 
ACGCGACCCA 
CATCGCCCGG 
CGCGCCTGCT 
GCCTGGACGA 
GGGGCGCGGA 
CCGATAGCAG 
AAGCCACGGA 
ATGTTCCTTG 



CGCCAGCGAA 
GGCGCTCATC 
GACGCOGCGG 
GGTCATGGGG 
GTACGGCTCC 
CATCGAGACC 
CTACCAGAGC 
CTCGGOCCTG 
GGCGCGCCGC 
CATCGCCTCG 
GCGGGTGTTC 
CCTCGGCGGC 
GCCGCCCGGA 
TGCGGGTAAG 
GCTGGGCATC 
GGTGAAGGCC 
GGGCTACGCC 
GGCCTTCCAG 
CGGAGACCGG 
GGCCGCCTCG 
GGAGCCCTGG 
GCCGTGCCGC 
CTCCGGGCAC 
GTGAACTCGA 
CCGTTGCACA 
TTG 



AGCCCCGAGC 
GTGCTGCTCA 
CTGCAGACGC 
CTGCTGGTGG 
TTCTTCTGCG 
CTGTGTGTCA 
CTGCTGACGC 
GTGTCGTTCC 
TGCTACAACG 
TCCGTAGTCT 
CGCGAGGCCC 
CCAGCGCGGC 
CCCCCGCGCC 
CGGCGGCCCT 
ATCATGGGCG 
TTCCACCGCG 
AACTCGGCCT 
GGACTGCTCT 
CCGCGCGCCT 
GACGACGACG 
GCCGGCTGCA 
CCCGGCTTCG 
GGCTTCCCAG 
AGCCCACAAT 



240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 



SEQ ID NO:187 PAV1 Protein sequence

Protein Accession #: AA011176



11 



21 



31 



41 



51 



MGAGVLVLGA 
MGLLMALIVL 
WGRWEYGSFF 
TVWAISALVS 
AFVYLRVFRE 
LANGRAGKRR 
FFNWLGYANS 
PPSPGAASDD 



LIVAGNVLVI 
CELWTSVDVL 
FLPILMHWWR 
AQKQVKKIDS 
PSRLVALREQ 
AFNPIIYCRS 
DDDDWGATP 



PLPDGAATAA 
VAIAKTPRLQ 
CVTASIETLC 
AESDEARRCY 
CERRFLGGPA 
KALXTLGIIM 
PDFRKAFQGL 
PARLLEPWAG 



RLLVPASPPA 
TLTNLFIMSL 
VIALDRYLAI 
NDPKCCDFVT 
RPPSPSPSPV 
GVFTLCWLPF 
LCCARRAARR 
CNGGAAADSD 



SLLPPASESP 
ASADLVMGLL 
TSPFRYQSLL 
NRAYAIASSV 
PAPAPPPGPP 
FLANWKAFH 
RHATHGDRPR 



EPLSQQWTAG 
WPFGATIW 
TRARARGLVC 
VSFYVPLCIM 
RPAAAAATAP 
RELVPDRLFV 
ASGCLARPGP 
FASESKV 



60 
120 
180 
240 
300 
360 
420



SEQ ID NO:188 BC02 DNA sequence 

Nucleic Acid Accession #: AJ400877

Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120 
CGGTGCTGCT GCTGCTGCTG CTGCTGCCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480
AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600 
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660 
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840
AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960
ACCGCACCTG TAAGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200 
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260 
CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440 
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500 
GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800
TGAGCTGCAT CGTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860
AGGCCGTCCA CAGGGAGCAG TTTCACCTCC AGCTCTCAGG CATGAACCTC GACGTGGCTA 1920 
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980
CAGAAAACCA ATGTGTCAGT TGCAGGGCTG GGACCTATTA TGATGGAGCA CGAGAACGCT 2040
GCATTTTATG TCCAAATGGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100 
GCCCAAGACC AGGAAATTCT GGGGCCCTGA AGACCCCAGA AGCTTGGAAT ATGTCTGAAT 2160 
GTCGAGGTCT GTGTCAACCT GGTGAATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220
GTGCCCTGGG CACGTTCCAG CCTGAAGCTG GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280 
GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCAGGA CTGTGAAACC AGAGTTCAAT 2340 
GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 
CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460 
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 
AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCXTGAGA 2640
TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGGAAA ACCTCTTCAT 2700
CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC GCCTTCACCT 2760
CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880 
GAGATGGCAG GCTCTATGCA TCTGAGAACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 
TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000
AGTCCCGAGA GATGTTTCCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060
TTTTGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180
CCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240
GAACTTGGTT TTTCTTTCCC AGCATCGTGG ATGTAGACTG AGAATGGCTT TGAGTGGCAT 3300
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420 
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480 
CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540 
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCTGGGAGG 3600 
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BC02 Protein sequence

Protein Accession #: CAB92285 



1 11 21 31 41 51 
I I I I I I 

MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180 
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD TADGPECSCH 240
PQYKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480 
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540
PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 660 
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 720 
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEFG 780
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 
PPPKRRILIV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSNEGNS ARGFQVPYVT YDEDYQELIE DIVRDGRLYA SENHQEILKD KKLIKALFDV 960
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK
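
The "Coding sequence" coordinates in these records can be cross-checked against the protein listings. The following is a minimal sketch in Python, assuming the coordinates are 1-based and inclusive and that the stated range ends with the stop codon (an assumption drawn from the BC02 record, where positions 3078-3080 read TGA). The function names are illustrative only and are not part of the original disclosure.

```python
def cds_slice(sequence: str, start: int, end: int) -> str:
    """Return the coding region for 1-based, inclusive coordinates.

    Whitespace is removed first, so a listing typed in space-separated
    10-base blocks can be pasted in directly (position numbers excluded).
    """
    cleaned = "".join(sequence.split())
    return cleaned[start - 1:end]


def expected_protein_length(start: int, end: int) -> int:
    """Number of residues encoded, assuming the range includes the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return length // 3 - 1  # the final codon is the stop codon


# Toy example (hypothetical sequence): ATG GCC TAA starts at position 3.
print(cds_slice("AAATG GCCTA ATT", 3, 11))
# BC02 record: coding sequence 81-3080.
print(expected_protein_length(81, 3080))
```

Under these assumptions the BC02 range 81-3080 corresponds to 999 residues, which is consistent with the length of the SEQ ID NO:189 listing (959 residues through the line numbered 960 would not fit; the final line runs past 960 to 999).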

SEQ ID NO:190 PFQ1 DNA sequence

Nucleic Acid Accession #: AF007170

Coding sequence: 1-1725 (underlined sequences correspond to stop codon)

1 11 21 31 41 51

AAGGAGGCGG CCTCCGGGAA AAGCG ACCGC AGG ACTCCTG AGAGCAGCCT CCATGAGGCC 60 
CTGGACCAGT GCATGACCGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120 
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTACCACT CACTGACATA TGCCACCATC 180 
CTGGAGATGC AGGCCATGAT G ACCTTTGAC CCTCAGGACA TCCTGCITGC CGGCAACATG 240 
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300 
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360 
GAGGTCTGCT ATGCAG AGTG CCTGCTGCAG CGAGCAGCCC TGACCITCCT GCAGGACG AG 420 
AACATGGTGA GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480 
GAGCTGGACA GCCTTGTTCA GTCCTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TGTAGGGGCC TTCAACCTG A CACTGTCCAT GCTTCCTACT 600 
AGGATCCTG A GGCTGTTGGA GTTTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCTGCTG 660 
CAGCTGGAGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720 
CTGTGCTACC ACACCTTCCT CACCTTCX3TG CTCGGTACTG GGAACGTCAA CATCG AGG AG 780 
GCXX3AGAAGC TCTTG AAGCC CTACCTGAAC CGGTACCCTA AGGGTGCCAT CTTCCTGTTC 840 
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTGATG CAGCCATCCG GCGTTTCGAG 900
GAGTGCTGTG AGGCCCAGCA GCACTGGAAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCT TCACCTACAA GGGCCAGTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020 
AGCAAGG AGA ACTGCTGGTC CAAGGCCACC TACATTTACA TGAAGGCCGC CTACCTCAGC 1080 
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAGAA GTTTGCCATC 1200 
CGG AAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTGCC TGCTCTGG AA 1260 
ATGATGTACA TCTGGAACGG CTACGCCGTG ATTGGGAAGC AGCCGAAACT CACGGATGGG 1320 
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTGGAGA AAGGOCCAGA GAACGAGTAC 1380 
TCAGTGGATG ACGAGTGCTT GGTGAAATTG TTG AAAGGCC TGTGTCTGAA ATACCTGGGC 1440 
CGTGTCCAGG AGGCCG AGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500 
TATGACCACT ACTTGATCCC AA ACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560 
GACAGAAACG AAG AGGCCAT CAAACTTTTG GAATCTGCCA AGCAAAACTA CAAGAATTAC 1620 
TCCATGGAGT CAAGGACACA CTTTCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680 
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGTAGCTTTG TGCAGCAGTT 1740 
CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTCCTGAA AACATTTCAA AATACCCCCT 1800 
CCCCCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGG CACAACATAG 1860 
TGTATCCGTG CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGCCAAG 1920 
GGCAGAGCAG GTGG AGCCCT CTGCCTGCCC TATCACACAT ACGGGTACTT GCTTTTCACT 1980 
GTGATGTTTA AGAG AATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040 
CACAGTTGGC TTTAAAAACC AACAACAATC AACCACCTGT AAGTCTTTGT CTICACCTAT 2100 
TATCATCTGG AGGTAAATCT CTTTATATGA TGATGCCAAA GGGCAAATTG CTTTTCAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTG ACGG AAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220 
AG AGGCTAAG CCTCAGGCTT CAATGCTTCT GGGGTTGGGC ATGAGGATGT ACACAGACAC 2280 
CCACTACCTT ACTACTC ACA CTTC ATTTCA CTCCTTTTGT AAATTTCCAA TTTAAAAATC 2340 
AAGCACGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAG AAAAATC AATCTCTACC 2400 
AGTAG AA AAT GCCAGGGCTT GATGG AAG AG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460 
AAATITGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTGTTAAA ACTGTTATTC TTGAAAAAAA AAAAAAAAAA 2640 
AA 

SEQ ID NO:191 PFQ1 Protein sequence

Protein Accession #: AAC39582



1 11 21 31 41 51 
I I I I I I 

MTALDLFLTN QFSEALSYLK PRTKESMYHS LTYATILEMQ AMMTFDPQDI LLAGNMMKEA 60 
QMLCQRHRRK SSVTDSFSSL VNRPTLGQFT EEEMAEVCY AECLLQRAALTFLQDENMVS 120 
FIKGGIKVRN S YQTYKELDS LVQSSQYCKG ENHPHFEGGV KLG VGAFNLT LSMLPTRILR 180 
LLEFVGESGN KDYGLLQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTG NVNIEEAEKL 240 
LKPYLNRYPK GAJFLFFAGR IEV1KGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYTYM KAAYLSMFGK EDHKPFGDDE VEUFRAVPGL 360 
KLKIAGKSLP TEKFAERKSR R YFSSNPISL PVPALEMMYI WNG YA VIGKQ PKLTDGILEI 420 
ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYLGRVQE AEENFRSlSA NEKKIKYDHY 480 
UPN ALLELA LLLMEQDRNE EAIKLLES AK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540 
SRSMVSSVSL 



SEQ ID NO:192 BF06 DNA sequence

Nucleic Acid Accession #: NM_032583

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

ATGACTAGGA AGAGGACATA CTGGGTGCCC AACTCTTCTG GTGGCCTCGT GAATCGTGGC 60 
ATCG ACATAG GCG ATG ACAT GGTTTCAGG A CTTATTTATA AAACCTATAC TCTOCAAGAT 120 
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACCG 180 
TGGGGGAAGT ATGATGCTGC CTTGAG AACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240 
CCTGCCCCCC AGCCCCTGG A CAATGCTGGC CTGTTCTCCT ACCTCACCGT GTCATGGCTC 300 
ACCCCGCTCA TGATCCAAAG CTTACGGAGT CGCTTAGATG AGAACACCAT CCCTCCACTG 360 
TCAGTCCATG ATGCCTCAGA CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GG AAGAAGAA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTCCAGAG A 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAG TGTACTCGGG 540 
CCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600 
CATGGAGTGG GACTCTGCTT TGCC C1 1111 CTCTCCGAAT GTGTGAAGTC TCTGAGTTTC 660 
TCCTCCAGTT GGATCATCAA CCAACGCACA GCCATCAGGT TCCGAGCAGC TGTTTCCTCC 720 
TTTGCCTTTG AGAAGCTCAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGCC 780 
ATCAGCTTCT TCACCGGTGA TGTAAACTAC CTGTTTGAAG GGGTGTGCTA TGGACCCCTA 840 
GTACTG ATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTCGA 900 
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGT ATTCATG 960 
ACAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTGAGG TCAGCGACCA GCGCATCCXpT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTGATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTGAAGG TATGGAAAGT CTGACTTTCT GCTCCAAACC TGGTGATGGC 1140
ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200 
ATTGCAGTCA AAGGTCTCAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA G AAG 1 Till C 1260 
CTOCAGGAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGGAGA GGAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440
CCAGAGGAAG AAGGGAACAG CCTGGGCCCA GAGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TGTTAGGGGT CTGCGGCAAC ACGGGGAGTG GTAAGAGCAG CCTGTTGTCA 1560 
GCCATCCTGG AGGAGATGCA CTTGCTOGAG GGCTCGGTGG GGGTGCAGGG AAGCCTGGCC 1620 
TATGTCCCCC AGCAGGCCTG GATCGTCAGC GGG AACATC A GGGAGAACAT CCTCATGGG A 1680 
GGCGCATATG ACAAGGCCCG ATACCTCCAG GTGCTCCACT GCTGCTCCCT GAATCGGGAC 1740 
CTGGAACTTC TGCCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800 
GGGGGGCAGA AACAGAGGAT CAGCCTGGCC CGCGCCGTCT ATTCCGACCG TCAGATCTAC 1860
CTGCTGGACG ACCCCCTGTC TGCTGTGGAC GCCCACGTGG GGAAGCACAT TTTTGAGGAG 1920 
TGCATTAAGA AGACACTCAG GGGGA AGACG GTCGTCCTGG TGACCCACCA GCTGCAGTAC 1980 
TTAGAATTTT GTGGCCAGAT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATGGAACT 2040 
CACAGTGAGT TAATGCAGAA AAAGGGG AAA TATGCCCAAC TTATCCAG AA GATGCACAAG 2100 
GAAGCCACTT CGGACATGTT GCAGGACACA GCAAAGATAG CAGAGAAGCC AAAGGTAGAA 2160 
AGTCAGGCTC TGGCCACCTC CCTGGAAG AG TCTCTCAACG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AGGAGGAGGA GATGGAAGAA GGCTCCTTGA GTTGGAGGGT CTACCACCAC 2280 
TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340 
ATCG7CTTCT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGG A GCAGGGCTCG 2400 
GGGACCAATA GCAGCCGAGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460 
AATCCTCAAC TGTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGCCCTGCT CCTCATCTGT 2520 
GTGGGGGTCT GCTCCTCAGG GATTTTCACC AAAGTCACGA GGAAGGCATC CACGGCCCTG 2580 
CACAACAAGC TCTTCAACAA GGTTTTCCGC TGCCCCATGA GTTTCTTTGA CACCATCCCA 2640 
ATAGGCCGGC TTTTGAACTG CTTCGCAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCGCCGTCCT GTTGATTGTC 2760 
AGTGTGC7GT CTCCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880 
TCTCCTTTAT TCTCCCACAT CCTCAATTCT CTGCAAGGCC TG AGCTCCAT CCATGTCTAT 2940 
GGAAAAACTG AAGACTTCAT CAGCCAGTTT AAGAGGCTG A CTGATGCGCA GAATAACTAC 3000 
CTGCTGTTGT TTCTATCTTC CACACGATGG ATGGCATTGA GGCTGGAGAT CATGACCAAC 3060 
CTIGTGACCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTCCTCCAC CCCCTACTCC 3120 
TTTAAAGTCA TGGCTGTCAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTGCC 3180 
CGGATTGGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240 
AAGATGTGTG TCTCGG AAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300 
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360 
ACCGTGCTTC ACGGCATCAA CCTG ACCATC CGCGGCCACG AAGTGGTGGG CATCGTGGGA 3420 
AGG ACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GGAGCCCATG 3480 
GCAGGCCGG A TTCTCATTGA CGGCGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540 
TCCAAGCTCT CAGTGATCOC TCAAG ATCCA GTGCTGCTCT CAGG AACCAT CAG ATTCAAC 3600 
CTAG ATCCCT TTG ACCX3TCA CACTGACC AG CAG ATCTGGG ATGCCTTGGA GAGG ACATTC 3660 
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780 
TCCAAGATCA TCCTTATCGA TGAAGCCACA GCCTCCATTG ACATGGAGAC AGACACCCTG 3840 
ATCCAGCGCA CAATCCGTGA AGCCTTCCAG GGCTGCACCG TGCTCGTCAT TGCCCACCGT 3900 
GTCACCACTG TGCTGAACTG TGACCACATC CTGGTTATGG GCAATGGG AA GGTGGTAG AA 3960 
TTTGATCGGC CGGAGGTACT GCGGAAGAAG CCTGGGTCAT TGTTCGCAGC CCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG ATAAGGAGAT GTGG AGACTT CATGGAGGCT GGCAGCTGAG 4080 
CTCAGAGGTT CACACAGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTGGAG 4140 
ATGAGAACTT CTCCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200 
AOOCTGGAAT AGGCTACTTG ATGGCTCTCA AG ACCTTAGA ACCOCAGAAC CATCTAAGAC 4260 
ATGGG ATTCA GTGATCATGT GGTTCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320 
AGGTAAAAGC TTATAGTTTT CTG ATCTGTG TTAGAAGTGY TGCAAATGCT GTACTGACTT 4380 
TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA 

SEQ ID NO:193 BF06 Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51 
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG LIYKTYTLQD GPWSQQERNP EAPGRAAVPP 60 
WGKYDAALRT MIPFRPKPRF PAPQPLDNAG LFS YLTVS WL TPLMIQSLRS RLDENTIPPL 120 
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRLIFDALLG ICFCIASVLG 180 
PILIIPKILE YSEEQLGNVV HGVGLCFALF LSECVKSLSF SSSWIINQRT AIRFRAAVSS 240
FAFEKLIQFK SVIHTTSGEA ISFFTGDVNY LFEGVCYGPL VUTCASLVI CSISSYFUG 300 
YTAFIAILCY IXVFPLAVFM TRMAVKAQHH TSEVSDQRIR VTSEVLTCIK LIKMYTWEKP 360 
FAK1IEGMES LTTCSKPGDG MAFSMLASLN LLRLS VFFVP IAVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TUQDPSKALV FEEATLS WQQ TCPGIVNGAL ELERNGHASE GMTRPRD ALG 480 
PEEEGNSLGP ELHKINLWS KGMMLGVCGN TGSGKSSLLS AILEEMHLLE GS VGVQGSLA 540 
YVPQQAWIVS GNIRENILMG GAYDKARYLQ VLHCCSLNRD LELLPFGDMT HGERGLNLS 600 
GGQKQRBLA RAVYSDRQIY LLDDPLSA VD AHVGKHIFEE CIKKTLRGKT WLVTHQLQY 660 
LEFOGQIILL ENGKICENGT HSELMQKKGK YAQIiQKMHK EATSDMLQDT AKIAEKPKVE 720 
SQALATSLEE SLNGNAVPEH QLTQEEEMEE GSLSWRVYHH YIQAAGGYMV SCIIFFFWL 780 
IVFLTIFSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLUC 840 
VGVCSSGBFT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRLLNCFAG DLEQIDQLLP 900 
IFSEQFLVLS LMVIAVLUV SVLSPY1LLM GAUMVICH YYMMFKKAIG VFKRLENYSR 960 
SPLFSHILNS LQGLSSIHVY GKTEDFISQF KRLTDAQNNY IXLFLSSTRW MALRLE1MTN 1020 
LVTLAVALFV AFGISSTPYS FKVMAVNIVL QLASSFQATA RIGLETEAQF TAVER1LQYM 1080 
KMCVSEAPLH MEGTSCPQGW PQHGEIIFQD YHMKYRDNTP TVLHGINLTI RGHEVVGIVG 1140
RTGSGKSSLG MALFRLVEPM AGRHJDG VD ICSIGLEDLR SKLSVIPQDP VLLSGTIRFN 1200 
LDPFDRHTDQ QIWDAUERTF LTKAISKFPK KLHTDV VENG GNFS VGERQL LCIARAVLRN 1260 
SKULIDEAT ASIDMETDTL IQRTIREAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKWE 1320 
FDRPEVLRKK PGSLFAALMA TATSSLR

SEQ ID NO:194 BHP9 DNA sequence

Nucleic Acid Accession #: AA983251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GGTGCGCACA CCTCCCGAGG GCGAGGCAGC 120

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGACCG CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGCCT GGATCCGCGC CCAGCAGCAG 240 

CCGCGGCCGC CGCCAGCTGG GCAGGCTCCC GGGACTGCGG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC CTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420

ACTCATAAGG CAGTCCCTAA GGGGACCGGG CCACCGGCTG AGGACGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGCGGCAGA GGGGAGTGGC 540 

CCGCGCGGAA AGCGCCGCGG GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTTCTCGGAG ACCGTCCTGC GCTCTCTGGA GACGCGCTGT CCGCGCCCAG GGTGGTGCCA 660 

TGTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATCCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGACGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGGTGCGCC CCCACCCGTG AGGGCCTGGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCAGCGGCC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGGG CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT ACGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800 

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATG 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

TTTTTTTTTT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATGTTGGCTG GGCTGGTCTC ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCTA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTGG 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTG 2940 

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GCCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACTTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATGG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA

SEQ ID NO:195 BHP9 Protein sequence

Protein Accession #: none found 

1 11 21 31 41 51

I I I I I I 

MLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSRGRGS DRERESRPEA AGLLWDRAAA 60

GEAEKGNRGE PPAWIRAQQQ PRPPPAGQAP GTAAGGAQDP RLRPGRSRGR VRLPVKPPEA 120

SGRQPRGPSD CIPRFPSASA THKAVPKGTG PPAEDGDGLG APGPRARRRR LLGVAAEGSG 180

PRGKRRGTVS DEARGSPGPR LLGDRPALSG DALSAPRVVP CGALAARPSP HPGTPLRSCS 240

CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI GFQCPERFDG GDATICCGSC ALRYCCSSAE 300 

ARLDQGGCDN DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGEGAPPPV RAWQRCSPEG 360

SPKGRQLLRA FPGLLPRARR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VFVAFIILGS 420

LVAACCCRCL RPKQDPQQSR APGGNRLMET IPMIPSASTS RGSSSRQSST AASSSSSANS 480

GARAPPTRSQ TNCCLPEGTM NNVYVNMPTN FSVLNCQQAT QIVPHQGQYL HPPYVGYTVQ 540
HDSVPMTAVP PFMDGLQPGY RQIQSPFPHT NSEQKMYPAV TV

SEQ ID NO:196 CQA5 DNA SEQUENCE

Nucleic Acid Accession #: AA088458

Coding sequence: 862-1995 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I

GCCCTTGGAC ACTGACATGG ACTGAAGGAG TAGAATGGAG CACGAGGACA CTGACATGGA 60 

CTGAAGAAAA AGGAGCTGGA GCAGGAGAAG GAGGTGCTGC TGCAGGGTTT GGAGATGATG 120 

GCGCGGGGCC GCGACTGGTA CCAGCAGCAG CTGCAACGAG TGCAGGAGCG CCAGCGCCGC 180 

CTGGGCCAGA GCAGAGCCAG CGCCGACTTT GGGGCTGCAG GGAGCCCCCG CCCACTGGGG 240 

CGGCTACTGC CCAAGGTACA AGAGGTGGCC CGGTGCCTGG GGGAGCTGCT GGCTGCAGCC 300 

TGTGCCAGCC GGGCCCTGCC CCCGTCCTCC TCCGGGCCCC CCTGCCCTGC CCTGACGTCC 360 

ACCTCACCCC CGGTCTGGCA GCAGCAGACC ATCCTCATGC TGAAGGAGCA GAACCGACTC 420 

CTCACCCAGG AGGTGACCGA GAAGAGTGAG CGCATCACGC AGCTGGAGCA GGAGAAGTCG 480 

GCGCTCATTA AGCAGCTGTT TGAGGCCCGC GCCCTGAGCC AGCAGGACGG GGGACCTCTG 540 

GATTCCACCT TCATCTAGTC CTTGTGGGCC GCGTGGGCCC CCAGGGCCAG CCTGGCACTC 600 

AGCCCTTCGA GGGTGGGCGC CCCATCGCAC CCACCCTCTC TGGCTGGAGA CCCCCGGCAG 660 

GCCCAGGCAC AGTCCCGGAG TGGGCGCCTT CCTGCCGCCC TTGCCAGATG GGCTCCCCAG 720 

GCCTGCCCCC GGCTGGTCCC CGCACCGAGC GCTTGACTCC GTTTKGGCTC CTGGTTGVTG 780 

ACATGGGCTG GGGGCTCTCT TGAGTCCGCA TAGTCCGCAG CTACTACTGG CCGCTGTCAG 840 

TGGACAGTGG GGTACCCCTC CATGAGTTAG CGTCCCCCCG TTTCCAGCGG TGCCGCCCTG 900 

GGTCCCATCT TCAGGGAAAG GCACTGCCCA CGCCAGGCTG CACTTCCAAC AACGGGCAGC 960 

AGAGGGCGCG GGGCGGCTCC GACGCGGGTC CAAGGGCAGC TTCCCGCTCA ACCAGGGCAC 1020 

CAGGACGAGG TGGCTGTAGC TCGGACGGAC GGAAGTAGAT GGAGGGGGTG GGGACGGCCT 1080 

GTAAGCGGGG GGTGCCTGCC TGGCTGGGGA GCCCCAGGGA TAGCGGTCGG ACTTCAGGTT 1140 

CTGGCCAAGG CTGAGGGACC CTGGCTGCAG CGGATCGGCA CGCCGGGTGG GCGAGAGCTT 1200 

GGCCTGCATG TGCCTCCCAC AGACCCTGGG GTGATGGCCT TCCCCCTCTT GGCCGGGACG 1260 

TTCCCCCACG TTGAGTCCCA CACAACATCC TGTGAGCCTG GCTCCCCAGG AGGGCCCCCA 1320 

GACAGCTCCC AGGCACGTCA TAGGCAAAGC CTGTTTCCCC CGACTCAGGA TTTCCAAGGC 1380 

CTGGGGTCCT GCTCACCCCC CTTTGCTCTC ACGCCCAGCC TGTCCCCAGG TTTCAGCTGG 1440 

GAGAGGCCAC CTCCCTCAGC CAAGGAAAAC GAGAACOCOC AGGGTACAGG AGGAGGCTGG 1500 

GGCAGGTCCC CTTGGGTGTC ACTCCCTCAG CCCCTGCCCA GGCCCACTCC CGCTGGTGCT 1560 

GGAGTACGCA CTGGTGGGGG GGCCCTGCTC AGCCCAACCT GGAGGGTCCC AGTGTCACCA 1620 

GAACCAGGGG CACGGCAACA GCATCGATGG GTTCTGCAGC CCAGGGCCCC CGATGCGGGG 1680 

TCAGTGTGTG TGGGGCGCAG GGCCTCCGAT GCGGGGTCAG TGCGTGGGGG GCGCAGGGCC 1740 

CCCGATGCGG GGTCAGTGCG TGGGGGGCGC AGGGCCCCCT CGTGTCCAGG GCACTTTGGT 1800 

ACACTCTCCC ACAAGGCACC TGTCTCAGAG GAGGGGCCCT GGCAGGCAGC GTGGCAACTC 1860 

CCTTCCGGAG CCCAGCTCCA TGCTAACCTG CCCACAGCAA CCCCACAGAG CCACATTCCC 1920 

TGCTGCACCT GGTCTGCAGG GGTGTCCCAG GACAGGCCCA AGTCAGCCCA GCATGCAGCT 1980 

GCOCTCCTAC CCTGAAGATG GGAGTGGGCT TTCCAGGGGA CATARGGATG TCAGGCCTGG 2040 

ACCTCCTGGG CAGGAAAGGG TGCAGGTCCT GAGGGCCTGT GCCCCACAGC CCCAGCACCC 2100 

AGGTGGACTG CAGCGCAGTG GGTGGGCCAG TGGCAGGCAG GGAGAAGCCC CCCGTCAGCA 2160 

GGCTGGGGTC TGCCCACCAG GGCCTCCCCA CGTCTGCCTT TGAGGGTGCC TGCCATGCCC 2220 

TGGGGGATCC TGGCATCTTT ACTGGACTGG AAGCAGGAGA CAGAACAGTG TCTGTCCCGG 2280 

GGTGACTTCA TCAGGAGACC GCCCACATAG AGCTGGACCC CGCAGCTGAA GCGGAAATGT 2340 

GAGACAGGCT GGCACCTCCG GAAAAACTGC CTTTCAGCCT TGGTGTTCCG TGCAAGGTGA 2400 

AAAGAAATAG GTCCTCCCAG TTTACAGCTT GAAATCAGGC TAGTGAGTGG CCCTGGAGAC 2460 

CACGAGGGGA GAATTTAAAG GCCCCGGCTG GCAGGGTCTA GGTGGCTGGC AGAGGCACAT 2520 

GCAGACCCTG CCTGGAGCCT GCCCTAGGAC GCTGGGCGGG TCAGTCTCCG TGCAGGATGT 2580 

GAGCAGCGTC CCTGGGCTCT ATCCGCGAGG TGCCAGTAGC GTGTGCAGGT ACATACACGT 2640 

GCGTGCACAC TGTGATGACA CCCGGAAATG TCTCAGGATG TTGAAATGTG TCCTTGGGGG 2700 

CAGAAGTGTC CCCAGTTGAG AATCTGCCCC AGAGGAACAC ACCCACACCA GGCCTCAGGA 2760 

TTTTGTGTTG ATCAAGTTCC AAGGAAAAGG AACATCTCAG CCGGGCGTGG TGGTTCACGC 2820 

CTGGAATCCC AGCACTTGAG GCCAGGAGTT CCAGAGCAGC CTGGGCAACG CAGTGAGAGA 2880 

CCCCATCTCT ACAARAAAAA AAAAAGAAAG AAAGAAAATG AGAGATCCAG GTTTAAAAAT 2940 

TCATAAACAC CACAAGGAAA CAATACACTA TGAGACCCAG CAGAAGCAAC AGATTGACTC 3000 

TAGACCCAGA TACTAGAATT ATCAGAGAGA ATATAAAGTA ACAGTGTTTT ATATATCTAA 3060 
AGAAATAAAA GAGATTTCTG GAAACATGAA AAAAAA

SEQ ID NO:197 LBG2 DNA SEQUENCE

Nucleic Acid Accession #: X63829

Coding sequence: 54-2543 (start and stop codons are underlined) 

1 11 21 31 41 51
I I I I I I

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA GCCATGGGGC 60 
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTG CAGTGCGCGG 120 
CCTCCGAGCC GTGCCGGGCG GTCTTCAGGG AGGCTGAAGT GACCTTGGAG GCGGGAGGCG 180
CGGAGCAGGA GCCCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAGAGC 240
CAGCTCTGTT TAGCACTGAT AATGATGACT TCACTGTGCG GAATGGCGAG ACAGTCCAGG 300
AAAGAAGGTC ACTGAAGGAA AGGAATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360
GAAGACACAA GAGAGATTGG GTGGTTGCTC CAATATCTGT CCCTGAAAAT GGCAAGGGTC 420
CCTTCCCCCA GAGACTGAAT CAGCTCAAGT CTAATAAAGA TAGAGACACC AAGATTTTCT 480
ACAGCATCAC GGGGCCGGGG GCAGACAGCC CCCCTGAGGG TGTCTTCGCT GTAGAGAAGG 540 
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGACCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CGCTGTGTCA GAGAATGGTG CCTCAGTGGA GGACCCCATG AACATCTCCA 660 
TCATCGTGAC CGACCAGAAT GACCACAAGC CCAAGTTTAC CCAGGACACC TTCCGAGGGA 720 
GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTGACAGCC ACAGATGAGG 780
ATGATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840
AGGACCCACA CGACCTCATG TTCACAATTC ACCGGAGCAC AGGCACCATC AGCGTCATCT 900 
CCAGTGGOCT GGAOCGGGAA AAAGTCCCTG AGTACACACT GACCATCCAG GCCACAGACA 960
TGGATGGGGA CGGCTCCACC AGCACGGCAG TGGCAGTAGT GGAGATCCTT GATGCCAATG 1020 
ACAATGCTCC CATGTTTGAC CCCCAGAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080 
GCCATGAGGT GCAGAGGCTG ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140 
GTGCCACCTA CCTTATCATG GGCGGTGACG ACGGGGACCA TTTTACXATC ACCACCCACC 1200 
CTGAGAGCAA OCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260 
AGCACACCCT GTACGTTGAA GTGACCAACG AGGCCCCTTT TGTGCTGAAG CTCCCAACCT 1320 
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380 
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440 
CTGCAGAAGA CCCTGACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACCCAG 1500 
CAGGGTGGCT AGCCATGGAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTCGACC 1560 
GTGAGGATGA GCAGTTTGTG AGGAACAACA TCTATGAAGT CATGGTCTTG GCCATGGACA 1620
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680
ACCATGGCCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGCGCC 1740 
ACGTGCTGAA CATCACGGAC AAGGACCTGT CTCCCCACAC CTCCCCTTTC CAGGCCCAGC 1800 
TCACAGATGA CTCAGACATC TACTGGACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860 
TCTTGTCCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTG 1920 
ACCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGCGAC TGCCATGGCC 1980
ATGTCGAAAC CTGCCCTGGA CCCTGGAAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTGTTCCTC CTGCTGGTGC TGCTTTTGTT GGTGAGAAAG AAGCGGAAGA 2100
TCAAGGAGCC CCTCCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC CGAGGTCTGG 2220 
AGGCCAGGCC GGAGGTGGTT CTCCGCAATG ACGTGGCACC AACCATCATC CCGACACCCA 2280 
TGTACCGTCC TAGGCCAGCC AACCCAGATG AAATCGGCAA CTTTATAATT GAGAACCTGA 2340 
AGGCGGCTAA CACAGACCOC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTOGACTATG 2400 
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCGCTT CAAGAAGCTG GCAGACATGT 2520
ACGGTGGCGG GGAGGACGAC TAGGCGGCCT GCCTGCAGGG CTGGGGACCA AACGTCAGGC 2580
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT GAGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTGACGTTAG AGTGGTTGCT 2700 
TCCTTAGCCT TTCAGGATGG AGGAATGTGG GCAGTTTGAC TTCAGCACTG AAAACCTCTC 2760
CACCTGGGCC AGGGTTGCCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820
TGCTCAACCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACTTTCTCT 2880 
CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT TTTTTTTTTT AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAGA GCTGCTGGGC CCACTGGCCG 3000
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAGAT 3120
GAAGGGTGAG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A 



SEQ ID NO:198 LBG2 Protein sequence:

Protein Accession #: CAA45177

1 11 21 31 41 51 
I I I I I I 

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180
YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240
DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300
TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360
AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 
PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600
TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD

SEQ ID NO:199 OB15 DNA SEQUENCE

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OB15 Protein sequence:

Protein Accession #: NP_036284

1 11 21 31 41 51 

I I I I I I 

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VIAAVIKNRK 60
FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL 120 
LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA 180
PIYSRSYLVF WTVSNLMAFL IMVVVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240
VMTVLGAFVV CWTPGLVVLL LDGLNCRQCG VQHVKRWFLL LALLNSVVNP IIYSYKDEDM 300
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS

SEQ ID NO:201 PAA6 DNA SEQUENCE

Nucleic Acid Accession #: AA569531 

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGACCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTTCTT 120 

GATGAAACCT CTGGACTAAG CACACATCTT CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATGA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

ATACCCAGAG GGAACAAACG CTCCCCAAAA AGAGTTACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540

GCTACGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATG 660 

CCAGCTACTC CTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720 

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence: 
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MTYSYSFFRP ELIVNHLNYV HSEANRRTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60
VLHLDIHGKK EDMRITQQSS QLYLWDMGGF TIFKNLWMSL IPRGNKRSPK RVTETILRDF 120 
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG 

SEQ ID NO:203 PAB2 DNA SEQUENCE

Nucleic Acid Accession #: XM_050197

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

TCACACGTGC CAAGGGGCTG GCTCAGCGGA ACCAGCCTGC ACGCGCTGGC TCCGGGTGAC 60 

AGCCGCGCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 

GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 

GGCGCCTGGC TGATTCCTAG GCAGTTGGCG GCAGCAAGGA GGAGAGGCCG CAGCTTCTGG 240 

AGCAGAGCCG AGACGAAGCA GTTCTGGAGT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 

TGGCOCACTA TGGTCCAGAG GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360

CTCTTGCTGG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGCCGC AGGCATCACC 420 

TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 

GGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540 

TGGCGTGGAC GCTATGGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 

CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTGCTGTG CCCGGATCCC 660 

AGGCCCCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 

GTGTGCTTCA CTCCACTGGA GGCCCTGCTC TCTGACCTCT TCCGGGACCC GGACCACTGT 780 

CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840 

CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 

TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTG 960 

GCTGAGGAGG CAGCGCTGGG CCCCACCGAG CCAGCAGAAG GGCTGTCGGC CCCCTCCTTG 1020 

TCGCCCCACT GCTGTCCATG CCGGGCCCGC TTGGCTTTCC GGAACCTGGG CGCCCTGCTT 1080 

CCOCGGCTGC ACCAGCTGTG CTGCCCCATG CCCCGCACCC TGCGCCGGCT CTTCGTGGCT 1140 

GAGCTGTCCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 

GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT GAGCCGGGCA CCGAGGCCCG GAGACACTAT 1260 

GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 

TTCTCTCTGG TCATGGACCG GCTGGTGCAG CGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380 

AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TGTCCCACAG TGTGGCCGTG 1440 

GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 

ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 

ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 

GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCACCTCCA 1680 

CCCGCGCTCT GCGGGGCCTC TGCCTGTGAT GTCTCCGTAC GTGTGGTGGT GGGTGAGCCC 1740 

ACCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 

GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT CCAGCTCAGC 1860 

CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 

GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGCGTA GAAAACTTCC 1980

AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040 

ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 

GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 

CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC AAGGGGGTTT CAGTCTGGAC 2220 

TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280 

ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTTGAGAC ACACCTAGAG AAGGGTTTTT 2340 

GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTAA CCTGCAGCTT 2400 

CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 

ACATATGAAA GTTATTTGTA GGGGAAGAGT CCTGAGGGGC AACACACAAG AACCAGGTCC 2520 

CCTCAGCCCC ACAGGCACTG GTCTTTTTTG CTOGANTCCA CCCCCCCCCT CTTTACCCTT 2580 
TT 



SEQ ID NO:204 PAB2 Protein sequence:
Protein Accession #: XP_050197 

1 11 21 31 41 51 

I I I I I I 

MVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVEE KFMTMVLGIG 60
PVLGLVCVPL LGSASDHWRG RYGRRRPFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120
ELALLILGVG LLDFCGQVCF TPLEALLSDL FRDPDHCRQA YSVYAFMISL GGCLGYLLPA 180 
IDWDTSALAP YLGTQEECLF GLLTLIFLTC VAATLLVAEE AALGPTEPAE GLSAPSLSPH 240 
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300 
YQGVPRAEPG TEARRHYDEG VRMGSLGLFL QCAISLVFSL VMDRLVQRFG TRAVYLASVA 360 
AFPVAAGATC LSHSVAVVTA SAALTGFTFS ALQILPYTLA SLYHREKQVF LPKYRGDTGG 420
ASSEDSLMTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVVVGEPTEA 480
RVVPGRGICL DLAILDSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540
VVFDKSDLAK YSA

SEQ ID NO:205 PAJ3 DNA SEQUENCE 

Nucleic Acid Accession #: AK002126

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGGTTCGCC GGGGGCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60 

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 

CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180 

GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAGAT CGCACAGCTC 240 
AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300

GCTGCTGGCC TGGGTCTGGA CAGGAGCCCC CCAGAGAAAA CCCAGGCCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480 

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540 

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600 

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780 

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840 

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TTTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380 

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCACCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 



SEQ ID NO:206 PAJ3 Protein sequence:
Protein Accession #: NP_060841

1 11 21 31 41 51 

I I I I I I 

MVRRGLLAWI SRVVVLLVLL CCAISVLYML ACTPKGDEEQ LALPRANSPT GKEGYQAVLQ 60

EWEEQHRNYV SSLKRQIAQL KEELQERSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120 

FLHSQVDKAE VNAGVKLATE YAAVPFDSFT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180 

AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTFKGDH KHEFKRLILF 240

RPFGPIMKVK NEKLNMANTL INVIVPLAKR VDKFRQFMQN FREMCIEQDG RVHLTVVYFG 300

KEEINEVKGI LENTSKAANF RNFTFIQLNG EFSRGKGLDV GARFWKGSNV LLFFCDVDIY 360 

FTSEFLNTCR LNTQPGKKVF YPVLFSQYNP GIIYGHHDAV PPLEQQLVIK KETGFWRDFG 420

FGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IVVRTPVRGL FHLWHEKRCM 480
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHEIEAHLRK QKQKTSSKKT

SEQ ID NO:207 PAJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF189723

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCCCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540 

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660 

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780 

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TOCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGCAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380 

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560 

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860 

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920 
ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980

CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 2040

AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGCAGC ATTAACTTTA 2100 

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160 

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 2220 

GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA 2280 

CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400 

TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 2460 

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGACAATTA 2520 

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 2640 

AAGGTTGAAA GGAGCAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA

SEQ ID NO:208 PAJ5 Protein sequence:
Protein Accession #: AAF27813

1 11 21 31 41 51 

I I I I I I 

MIPVLTSKKA SELPVSEVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKKYI 60

SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIVVTV AFVQEYRSEK SLEELSKLVP 120

PECHCVREGK LEHTLARDLV PGDTVCLSVG DRVPADLRLF EAVDLSIDES SLTGETTPCS 180 

KVTAPQPAAT NGDLASRSNI AFMGTLVRCG KAKGVVIGTG ENSEFGEVFK MMQAEEAPKT 240

PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEMFT ISVSLAVAAI PEGLPIVVTV 300

TLALGVMRMV KKRAIVKKLP IVETLGCCNV ICSDKTGTLT KNEMTVTHIF TSDGLHAEVT 360 

GVGYNQFGEV IVDGDVVHGF YNPAVSRIVE AGCVCNDAVI RNNTLMGKPT EGALIALAMK 420

MGLDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 480

GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TFLGLVGIID PPRTGVKEAV 540 

TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSVSG EEIDAMDVQQ LSQIVPKVAV 600 

FYRASPRHKM KIIKSLQKNG SVVAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAADMI 660

LVDDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 720

INIIMDGPPA QSLGVEPVDK DVIRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780 

ELRDNVITPR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIMGQL 840 

LVIYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 900
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600 

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660 

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200 

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260 
ATTAGTATCT TTCAACTCGA GTAA 



SEQ ID NO:210 PAV4 Variant 1 Protein sequence:
Protein Accession #: none found 

1 11 21 31 41 51

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60
LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120 
GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180 
PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240
FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300
IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360
SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE

SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480 

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200
TAA 

SEQ ID NO:212 PAV4 Variant 2 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120 

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180 

EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240 

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE 

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE 

Nucleic Acid Accession #: N62096
Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 






SEQ ID NO:214 PAV4 Variant 3 Protein sequence:
Protein Accession #: none found 



1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH 120

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240

VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300

SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360
HVQQTTQLST LNISIFQLE 



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60 

ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120 

GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 

GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240 

GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300 

AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420 

ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 

ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 

TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 

TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 

ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 

TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 

GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 

TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 

AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence:
Protein Accession #: none found



1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN VVNSIIGSGI IGLPYSMKQA 60

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120 

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180 

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240 

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ

SEQ ID NO:217 PAV9 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGGAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTGTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840 

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 
GGGTCTGAGG AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTCG 1020

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080 

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140 

CTCATGGACG CCCTGCTGAA TGACCGGCCT GAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200 

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCCC 1260 

TCCAACTCGC TCATCCGCAA CCTTTTGGAC CAGGCGTCCC ACAGCGCAGG CACCAAAGCC 1320 

CCAGCCCTAA AAGGGGGAGC TGCGGAGCTC CGGCCCCCTG ACGTGGGGCA TGTGCTGAGG 1380 

ATGCTGCTGG GGAAGATGTG CGCGCCGAGG TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440 

CCAGGCCAGG GCTTCGGGGA GAGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500 

TCGCTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560 

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620 

GCTCTTGGGG CCTGTTTGCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680 

GCAGCACGGA GGAAAGAOCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740 

GAGTGCTATC GCAGCAGTGA GGTGAGGGCT GCCCGCCTCC TCCTCCGTCG CTGCCCGCTC 1800 

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAGCTG ACGCCCGTGC CTTCTTTGCC 1860 

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920 

CCCATCTGGG CCCTGGTTCT CGCCTTCTTT TGCCCTCCAC TCATCTACAC CCGCCTCATC 1980 

ACCTTCAGGA AATCAGAAGA GGAGCCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040 

GTCATTAATG GGGAAGGGCC TGTCGGGACG GCGGACCCAG CCGAGAAGAC GCCGCTGGGG 2100 

GTCCCGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGCGGGGG GCGCCGGTGC 2160 

CTACGCCGCT GGTTCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220 

AGCTACCTGC TGTTCCTGCT GCTTTTCTCG CGGGTGCTGC TCGTGGATTT CCAGCCGGCG 2280 

CCGCCCGGCT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340 

CTGCGCCAGG GCCTGAGCGG AGGCGGGGGC AGCCTCGCCA GCGGGGGCCC CGGGCCTGGC 2400 

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460 

CTAGTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC GGCTGACCCC GGGTTTGTAC 2520 

CACCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580 

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAAG 2640 

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGCCTATGG CGTGGCCACG 2700 

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTAC 2760 

CGTCCCTACC TGCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820 

GAGCACAGCA ACTGCTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880 

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCGT CATCTTCCTG 2940 

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCATTGCCA TGTTCAGTTA CACATTCGGC 3000 

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060 

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120 

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180 

TTCCGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240 

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCGACTC CGAGCGTCTG 3300 

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360 

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420 

GTGGCCGAGG CCCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480 
CTGCCTGGGT CCAAAGACTG A 



SEQ ID NO:218 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MEDAFGAAW TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60 

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL HTGIGRHVGV 120 

AVHDHQMAST GGTKWAMGV APWGWKNRD TLINPKGSFP ARYRWRGDPE DGVQFPLDYN 180 

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

ENATQAQLPC LLVAGSGGAA DCLAETLEDT LAPGSGGARQ GEARDRIRRF FPKGDLEVLQ 300 

AQVERIMTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360 

AQSELFRGDI QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHFLTPM RLAQLYSAAP 420 

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480 

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYF WEMGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAFK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600 

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFF CPPLIYTRLI 660 

TFRKSEEEPT REELEFDMDS VINGEGPVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHFWGA PVTIFMGNVV SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780 

LRQGLSGGGG SLASGGPGPG HASLSQRLRL YLADSWNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900 

EGLLRPRDSD FPSILRRVFY RPYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960 

GTCVSQYANW LVVLLLVIFL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020 

FHSRPALAPP FIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080 

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQCSRVLGW 1140 
VAEALSRSAL LPPGGPPPPD LPGSKD 

SEQ ID NO:219 PBF1 DNA SEQUENCE 

Nucleic Acid Accession #: AA054237 

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGACCCCCG GCGCCACAAG 120 

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 

CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CCCCCGCTGG GGCGCCGGCT GCTCCCGGGC 240 

GGCCCGGGGC GCGCCGACCC CGAGTCCTGG CGCTCGCTCC TGGGGCTCGG CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTCGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACG 420 

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480 

AAGACCATAC AGCAAGATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CCCTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA CCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780 

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840 
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA 

SEQ ID NO:220 PBF1 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MEPRALVTAL SLGLSLCSLG LLVTAIFTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60 

PLSHLPLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120 

LGIDRDIDTL ILKGIAQRCT AIKYHFSQPI RLRNIPFNLT KTIQQDEWHL LHLRRITAGF 180 

LGMAVAVLLC GCIVATVSFF WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240 
KLIYSLPADV EHGYSWSIFC AWCSLGFIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 






SEQ ID NO:221 PCI4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016570 

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120 

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420 

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720 

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020 

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:222 PCI4 Protein sequence: 

Protein Accession #: NP_057654 

1 11 21 31 41 51 

| | | | | | 

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60 

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 

KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 

IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360 
EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001935.1 

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300 

ATCTTGGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360 

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540 

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600 

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGACGG GGAAAGAAGA TATAATATAT 660 

AATGGAATAA CTGACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720 

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATTGAATACT CCTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840 

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900 

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTG CTCCTGCTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080 

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTGGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200 

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440 

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500 

GGTCTGCCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680 

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740 

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800 

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGGAAGTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100 

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAG ACCATGGAAT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340 

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400 

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700 

TTTCTAACTG GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820 

TGGCTGGGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCTCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000 

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300 

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC 



SEQ ID NO:224 PEZ3 Protein sequence: 

Protein Accession #: NP_001926.1 

1 11 21 31 41 51 

| | | | | | 

MKTPWKILLG LLGAAALVTI ITVPVVLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60 

RWISDHEYLY KQENNILVFN AEYGNSSVFL ENSTFDEFGH SINDYSISPD GQFILLEYNY 120 

VKQWRHSYTA SYDIYDLNKR QLITEERIPN NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180 

PSYRITWTGK EDIIYNGITD WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240 

YSDESLQYPK TVRVPYPKAG AVNPTVKFFV VNTDSLSSVT NATSIQITAP ASMLIGDHYL 300 

CDVTWATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360 

EPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG TWEVIGIEAL TSDYLYYISN 420 

EYKGMPGGRN LYKIQLIDYT KVTCLSCELN PERCQYYSVS FSKEAKYYQL RCSGPGLPLY 480 

TLHSSVNDKG LRVLEDNSAL DKMLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY 540 

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT 600 

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660 

YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAMWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP 

SEQ ID NO:225 PBJ2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60 

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGTCTA A 



SEQ ID NO:226 PBJ2 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLLMGVLEAC 60 
VEMRPLSVWS LRDDKEQSPH QPTLDV 

SEQ ID NO:227 PBM2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180 

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 

TTTATGGCTA TTGAAGAAGA AATGAAGAAG CACGGAAGTA CTCATGTGGG ATTCCCAGAA 300 

AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTTCT GA 

SEQ ID NO:228 PBM2 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MPNAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60 
IMWTSFVEDN LSMGWGKLED FMAIEEEMKK HGSTHVGFPE NLTNGAAAGN GDDGLIPPRK 120 
SRTPESQQFP DTENEEYHRF VKDQIVVDMR RYF 

SEQ ID NO:229 PEZ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014253 

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GACTGCTTGC ATTAAAGGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120 

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGACCC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360 

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420 

TGCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGCCAACTCT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTG ATGGGGAAAA 540 

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660 

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720 

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780 

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAACAGC AACATACCAT TGGAGACCAG 840 

GCATTCCCTG TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960 

CTTTTCCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020 

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320 

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440 

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560 

TATGGATCAA GGACCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620 

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680 

TGGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTCCTTGGAC CTGACTGTGC 1740 

TAGAGATTCC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860 

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920 

AGGAGAAATA TGCGAGGAAG AGGACTGCCT AGACCCAATG TGTTCCAACC ATGGCATCTG 1980 

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATG 2100 

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTGTA CCATGGAGTG 2160 

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAACGCTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGGGAGGG CGACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTGT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGGC CATCGGTGGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAGAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TGGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATC GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCCCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TGTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

CGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCCGAGGG 5040 

ACACCTGACC AATGCAACGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGACCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340 

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640 

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCGCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCGAGTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCATAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATGATATTT TTGAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780 
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840 
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC GCGACCGCGA ACCCCATAAG 6900 
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960 
AGGTCACCTT ATTGCCATGG AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020 
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCGAGGTCAG GTCATAAAGG AGATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGGGGCAAA GGGATTATGA 7200 
TGTTGTTGCT GGCAGATGGA CAACGGCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260 
TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320 
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTTGGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGGCTT CAGACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500 
GTGTGAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTG GACCAACTAC CTATGACTCC 7560 
CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620 
TTCTGTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGCC ATTCTCAATA ATGCCCATTA 7740 
CCTGGAAAAC CTACATTTTA CCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTGGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860 
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGACGGTT 7920 
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980 
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100 
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220 
GAGCGAAATA GGCAGGAGGT AACAAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280 
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340 
AAATATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400 
ATTGTTTGTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460 
CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520 
ATTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760 
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820 
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCCCTGTGGG 8880 
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTGTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120 
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TGATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360 
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420 
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC CGTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720 
TGACAAAGAG ATAGTTTGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840 
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020 
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATACCACTT ACACATGTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260 
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320 
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620 
AATACGTATT TGGTTGGTTC GTGCCTTTAG TTTGTTAAAG TTACATTTGT ATTATATTCA 10680 
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920 
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGTTTGT 11160 
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220 
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TGTCXTTTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460 
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520 
TCACATGCTA CCTATGTAGA CAGGTATGAA ATTAAGTTAT AATTTTCATG AGACATTTTC 11580 
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATGTTAT TTTGACTCTT TTTCTTTTTT 11640 
TTTTCTTTAA AAATATATTT TTAACTAGAC CAGGCCCCAC TATAATATCA CTTAAGAGAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760 
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820 
TTGGAGCATA TTATATATAG CTTGTGGAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
CCTGTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTG CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060 
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120 
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTGGC 12240 
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360 
AAAAATATTT TACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420 
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTTTTTAGCT GCTTACTTTC TCATGAAAAG TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660 
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720 
ATTTTGCTCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 

SEQ ID NO:230 PEZ2 Protein sequence: 



Protein Accession #: NP_055068 

1 11 21 31 41 51 

| | | | | | 

MEQTDCKPYQ PLPKVKHEMD LAYTSSSDES EDGRKPRQSY NSRETLHEYN QELRMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120 

LRMWIRGMKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSPVCCDM EAQAGSTQDV 180 

QSSPHNQFTF RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240 

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPRPLPRSTF 300 

SRPAFTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV FQKGRAIDTG EVDIGAQVMQ TIPPGLFWRF 420 

QITIHHPIYL KFNISLAKDS LLGIYGRRNI PPTHTQFDFV KLMDGKQLVK QDSKGSDDTQ 480 

HSPRNLILTS LQETGFIEYM DQGPWYLAFY NDGKKMEQVF VLTTAIEIMD DCSTNCNGNG 540 

ECISGHCHCF PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPECD VPEEQCIDPT 600 

CFGHGTCIMG VCICVPGYKG EICEEEDCLD PMCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TFLLDAGVCS CDPKWTGSDC STELCTMECG SHGVCSRGIC QCEEGWVGPT 720 

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNVVME MLCGDNLDND GDGLTDCVDP DCCQQSNCYI SPLCQGSPDP 840 

LDLIQQSQTL FSQHTSRLFY DRIKFLIGKD STHVIPPEVS FDSRRACVIR GQVVAIDGTP 900 

LVGVNVSFLH HSDYGFTISR QDGSFDLVAI GGISVILIFD RSPFLPEKRT LWLPWNQFIV 960 

VEKVTMQRVV SDPPSCDISN FISPNPIVLP SPLTSFGGSC PERGTIVPEL QVVQEEIPIP 1020 

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMIKVHLTV AVEGRLTQKW FPAAINLVYT 1080 

FAWNKTDIYG QKVWGLAEAL VSVGYEYETC PDFILWEQRT VVLQGFEMDA SNLGDWSLNK 1140 

HHILNPQSGI IHKGNGENMF ISQQPPVIST IMGNGHQRSV ACTNCNGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHKYY LAMDPVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEVVA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320 

FVDGTMIRKI DENAVITTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPMDNSLY 1380 

VLDNNIVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IHSTLESARA ISVSHSGLLF 1440 

IAETDERKVN RIQQVTTNGE IYIIAGAPTD CDCKIDPNCD CFSGDGGYAK DAKMKAPSSL 1500 

AVSPDGTLYV ADLGNVRIRT ISRNQAHLND MNIYEIASPA DQELYQFTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGMPLWL VVPGGQVYWL TISSNGVLKR 1620 

VSAQGYNPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSFHSDLEK 1680 

LTKVELDTSN RENVLMSTNL TATSTIYILK QENTQSTYRV NPDGSLRVTF ASGMEIGLSS 1740 

EPHILAGAVN PTLGKCNISL PGEHNANLIE WRQRKEQNKG NVSAFERRLR AHNRNLLSID 1800 

FDHITRTGKI YDDHRKFTLR ILYDQTGRPI LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860 

EKMEYDQSGK IISRTWADGK IWSYTYLEKS VMLLLHSQRR YIFEYDQSDC LLSVTMPSKV 1920 

RHSLQTMLSV GYYRNIYTPP DSSTSFIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980 

VLYDTTQVTL TYEESSGVIK TIHLMHDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040 

YSYNNFRVTS MQAVINETFL PIDLYRYVDV SGRTEQFGKF SVINYDLNQV ITTTVMKHTK 2100 

IFSANGQVIE VQYEILKAIA YWMTIQYDNV GRHGNMCIRV GVDAN1TRYF YEYDADGQLQ 2160 

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KMDEDGFLRQ 2220 

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280 

THLYNHTSSE ITSLYYDLQG HLIAMELSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340 

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGQRDYDV VAGRWTTAYH HIWKQLNLLP 2400 

KPFNLYSFEN NYPVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPEL ENLELTYELL 2460 

RLQTKTQEWD PGKTZLGIQC ELQKQLRNFI SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520 

VFGKGIKFAI KDGIVTADII GVANEDSRRL AAILNNAHYL ENLHFTIEGR DTHYFIKLGS 2580 

LEEDLVLIGN TGGRRILENG VNVTVSQHTS LLNGRTRRFA DIQLQHGALC FNIRYGTTVE 2640 

EEKNHVLEIA RQRAVAQAWT KEQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VEQYLELSDS ANNXHFMRQS EIGRR 

SEQ ID NO:231 PF04 DNA SEQUENCE:

Nucleic Acid Accession #: NM_000441

Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51



CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAACGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCGCGG GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCGGA GCCGCCGCAG CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300 

TCTACAGCGA GCTCGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TGCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900 

CTGCTGCCTT CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GCTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440 

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGCCCTGGG GAAGCTTCTG GAACCCTTGC 1560

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TCGGTTTACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGATAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TGTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATG TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

CGTGCCACTG CACTCCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500

AAAAAAAAAA AGAGTGAATG TAATAGTCTT GCAGAAAATG AATGAATACC TTTGTTCAAT 4560

AAAGGAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGCC AAAGTTACGT TTTACAACAA 4620 

GGCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAGCAA ACTGCGGGAA 4680 

TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAGTTTTG TATTATCAAT 4740 

GAAAATTTCA CTTGAAATTA AAGCTGCCTT TTGTTATATT TTTAACCTAT AGGATAAGAT 4800 

TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860 

TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TGTGTCCTTT 4920 
CTGAACAAAA 



SEQ ID NO:232 PFP4 Protein sequence:

Protein Accession #: O43511



1 11 21 31 41 51 

I I I I I I 

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQQQHERRL QERKTLRESL AKCCSCSRKR 60

AFGVLKTLVP ILEWLPKYRV KEWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120 

FFPILTYFIF GTSRHISVGP FPVVSLMVGS VVLSMAPDEH FLVSSSNGTV LNTTMIDTAA 180

RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240 

NVSTKNYNGV LSIIYTLVEI FQNIGDTNLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300

PIEVIVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAVVA 360

YAIAVSVGKV YATKYDYTID GNQEFIAFGI SNIFSGFFSC FVATTALSRT AVQESTGGKT 420 

QVAGIISAAI VMIAILALGK LLEPLQKSVL AAVVIANLKG MFMQLCDIPR LWRQNKIDAV 480

IWVFTCIVSI ILGLDLGLLA GLIFGLLTVV LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540

EPQGVKILRF SSPIFYGNVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600

KNGIISDAVS TNNAFEPDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPIHSLVL 660 

DCGAISFLDV VGVRSLRVIV KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIRKDTFFL 720 

TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780

SEQ ID NO:233 PFH2 DNA SEQUENCE: 

Nucleic Acid Accession #: NM_016029

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I 

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 

TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 

TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 

TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 

ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600 

CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 

ATGGCCTTCG AACAGAACTT GGCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 

GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 

TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 

CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT GAATCTTGCA AA 



SEQ ID NO:234 PFH2 Protein sequence:

Protein Accession #: NP_057113



1 11 21 31 41 51 

I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180
KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KKGKKRIENF KSGVDADSSY FKIFKTKHD 



SEQ ID NO:235 ACC5 DNA SEQUENCE

Nucleic Acid Accession #: NM_000450
Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60 

GCCTGGTCTT ACAACACCTC CACGGAAGCT ATGACTTATG ATGAGGCCAG TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GGTTGCAATT CAAAACAAAG AAGAGATTGA GTACCTAAAC 180 

TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240 

TGGGTCTGGG TAGGAACCCA GAAACCTCTG ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300 

GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCCCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA GTCACCCACT GGGAAACTTC 600 

AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660 

ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTCCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840 

GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGGGACA ACGAGAAGCC AACGTGTAAA 900 

GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080 

GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140 

CTTCCTAGTG CTTCTGGCAG TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260 

GAGAAGCCCA CATGTGAAGC TGTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1500 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTACGTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 



SEQ ID NO:236 ACC5 Protein sequence:
Protein Accession #: NP_000441



1 11 21 31 41 51 

I I I I I I 

MIASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120 

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC 180 

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300 

AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360

VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600 
GSYQKPSYIL 

SEQ ID NO:237 PM28 DNA SEQUENCE

Nucleic Acid Accession #: N51002

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCC CAATGAGCCA AAGGGGGTCC 60 

CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240 

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300 

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720 

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCCCG AGTGGGAGAG 900 

GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080

AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320 

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380 

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500 

TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560 

GAAATTGAAA AGCTGAGATC TGAACTTGAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680 

TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740 

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800 

AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860 

TCTGATATTG ATGATGATGA CAGAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920 

AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980 

AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

GGTACCTCCA TTACTGCCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160 

GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220 

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280 

GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CTCCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580 

GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820 

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880 

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000 

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060 

CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGGT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180 

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAACAT 3300 

GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480 

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660 

GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720 

TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780 
GCGGCCGCTT TAA 



SEQ ID NO:238 PM28 Protein sequence:

Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

MMCEVMPTIN EDTPMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60

QRLQDVIYDR DSLQRQLNSA LPQDIESLTG GLAGSKGADP PEFAALTKEL NACREQLLEK 120

EEEISELKAE RNNTRLLLEH LECLVSRHER SLRMTVVKRQ AQSPSGVSSE VEVLKALKSL 180

FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 240 

HLEGMEPGQK VHEKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAQMKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360

NDKLENELAN KEAILRQMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRIAAL 420

TKAEERHGNI EERMRHLEGQ LEEKNQELQR ARQREKMNEE HNKRLSDTVD RLLTESNERL 480 

QLHLKERMAA LEEKNVLIQE SETFRKNLEE SLHDKERLAE EIEKLRSELD QLKMRTGSLI 540

EPTIPRTHLD TSAELRYSVG SLVDSQSDYR TTKVIRRPRR GRMGVRRDEP KVKSLGDHEW 600 

NRTQQIGVLS SHPFESDTEM SDIDDDDRET IFSSMDLLSP SGHSDAQTLA MMLQEQLDAI 660

NKEIRLIQEE KESTELRAEE IENRVASVSL EGLNLARVHP GTSITASVTA SSLASSSPPS 720

GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAVVEE DGREDKATIK CETSPPPTPR 780

ALRMTHTLPS SYHNDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840 

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLEEARR KGLPFAQWDG 900 

PTVVAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960

MVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020 

HEWIGNEWLP SLGLPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080 

LKRLNYDRKE LERRREASQH EIKDVLVWSN DRIIRWIQAI GLREYANNIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQN TQARQILERE YNNLLALGTE RRLDESDDKN FRRGSTWRRQ 1200 
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSCLE

SEQ ID NO:239 PC14 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATC GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 

GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:240 PC14 Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

I I I I I I

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 
KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 
VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 
IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360
EDGHTDNHLP LLENNTH 

SEQ ID NO:241 PBA7 DNA SEQUENCE

Nucleic Acid Accession #: AA219134

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons)

AATTCGCCCT TGCTTAATTA AGCATGTTTA CCTTCCTGTC ATCTGTCACT GCTGCTGTCA 60

GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCTT CAGATCAAAA 120

CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180

CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240

TCATCTTGTC ATCCTGCCTG CTTGGACTCG GAAGCTTAGT CTTGATCCTC AGTTTATCCT 300

ACACGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCTCCCTC TCTTCCATTG 360

CCACTTGTGT TTACATCGCA GAGATTGCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420

TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480

CCAATGTTTT CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGAGTTTTGC 540

AAGCAATTGC AATGTATTTT CTTCCTCCAA GCCCTCGGTT TCTGGTGATG AAAGGACAAG 600

AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTGAGGAAC 660

TCACTGTGAT CAAATCCTCC CTGAAAGATG AATATCAGTA CAGTTTTTGG GATCTGTTTC 720

GTTCAAAAGA CAACATGCGG ACCCGAATAA TGATAGGACT AACACTAGTA TTTTTTGTAC 780

AAATCACTGG CCAACCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 

TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT CCACTGGGGT TGGAGTCGTC AAGGTCATTA 900

GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CTCTGCATTG 960

GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020

TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080

TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1140

GGATTTCTTC CCATAGCAGA AGCTCACTCA TGCCTCTGAG AAATGATGTG GATAAGAGAG 1200

GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260

TCACAGACCC TGGGGACGTC CCAGCTTTTT TGAAATGGCT GTCCTTAGCC AGCTTGCTTG 1320

TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380

TCTTTCCTGG TGGGATCAGA GGACGAGCCA TGGCTTTAAC TTCTAGCATG AACTGGGGCA 1440

TCAATCTCCT CATCTCGCTG ACATTTTTGA CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500

TGTGCTTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560

TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620

AGGGATCCTC TTTGGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680

ACATTTGTTT TATGAGTCAT CACCAAGAAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740

AACCCCAGGA GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 

TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860 

GGAGGGTGTC TTTGGACCAA TGCATAGTTG CGACTCCTGT GCTCTCTTTT CAGTGTCATG 1920

GAACTGGTTT TGAAGAGACA CTCTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980

CAGAAGGAAC CTCAAAAGGT AGATGAGGTA CAAGGTCCTA AGTGATCTCT TTTTCTGAGC 2040

AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100

AGAGCAGCCT TTGAATAGAC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160

TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220

GTACAGTTTC TGCCTACCAA GACACTACTT GCACTGGATC TTACGCAAAA AAGAACCAGA 2280
ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAGAGGG TCCTGGCTAG 2340
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400
TGAACTATAA ACTATAATTT AATGCAAAAT ATCCTTTTAT GAATTTCATG TTAATATTGT 2460
GAAATATTAA AATAATTCCR CAATAGTTGA GAAAAATGAG CATTTTTTTC CATTTTTAAA 2520
AAATGCATAG AAAAGACAAT TTTAAAATCC TGGGACCATA TTTATTTAGA AGTAGCTGTT 2580
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640
AGGTTGAAGT TATTAAGTCA AGCCTAGAAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700
TGTATAGTAA TCCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760
TACGGTACAC AGGCTATAAT TGATGATGAT GTTCAGATAA CTGAAGACAC AATAAATGAC 2820
ATTCAGACAT CAGGAMAAWW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATGATAG AAAAWCCACA ATCAAAWCTT 3000
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTGATG GAAGACACAC AAAAAACTTA 3060
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120
GAATCCTAAG AGCCAAAGCT CCTTTTTATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180
AAACTGTCCC AATGTCATAT AAGGAAACAT GATCTATTAC ATTCCTTTAT AACAATGTGG 3240
AGAGACTATA AACCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACTGACTAAA 3300
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360
GGAGATTGGA GATCTCAATT CTATCATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420 
G il HUGH TTTGGAAAGA GAAGGGAAGT GTGTTCTGCC CCATGTTTCC TTCCGTGTTT 3480 
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCCTTCAT TATAAATGGG 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCTTG 3600
AGCATTCITr TATATTTTTC TnTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720
AAAAGGGGGA AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780
ATTTACAGTA CTTATGGAGA GCCAGCAGAA GACATCAGAG CACTCACTTC TTCCCATCTT 3840
TGTTAAGGTT AGCGAATTAC CCATGGACAC TGTTAGGTGA GGCTCATTCG GCAGCCCTGA 3900
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960 
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020 
TACCTTGGCT ATATAAGCAT GTTTTCCCCC TATTCTATGT TTCTTTTTTT GGTGAACATT 4080
GAAAAACAGG AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140 
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200 
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA



SEQ ID NO:242 PBA7 Protein sequence:

Protein Accession #: AAF91431

MFTFLSSVTA AVSGLLVGYE LGIISGALLQ IKTLLALSCH EQEMVVSSLV IGALLASLTG 60
GVLIDRYGRR TAIILSSCLL GLGSLVLILS LSYTVLIVGR IAIGVSISLS SIATCVYIAE 120
IAPQHRRGLL VSLNELMIVI GILSAYISNV AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVIKSSL KDEYQYSFWD LFRSKDNMRT 240
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGVVK VISTIPATLL 300
VDHVGSKTFL CIGSSVMAAS LVTMGIVNLN IHMNFTHICR SHNSINQSLD ESVIYGPGNL 360
STNNNTLRDH FKGISSHSRS SLMPLRNDVD KRGETTSASL LNAGLSHTEY QIVTDPGDVP 420
AFLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTSSMN WGINLLISLT 480
FLTVTDLIGL PWVCFIYTIM SLDLIGLPWV CFIYTIMSLA SLLFVVMFIP ETKGCSLEQI 540
SMELAKVNYV KNNICFMSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET



SEQ ID NO:243 PAB4 DNA sequence:
Nucleic Acid Accession #: AA172056

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons)

TTTAGCCACC AGAGGANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60
TGGATTTTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGCCT TGAAATAGTG 120
ATOrnTlTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCACAGAANN TTATTTITJOC 240
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300
CATCTCAGAC ATCGACAGAT GATTACATCA CTTATAGTTC TAGTAAATTT ATTAATATAA 360
AACTCAGAGA CATTCCAATA TCCACATTGC TTACACCATT AGGCATAGAT TCAGTGTCAG 420
CTATGACAAT TGAAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480
TGCTTGATCC AGATGCAGGA CTGCAAATGT TAATATTTGT TCTGGAAGAA CAATCAAATA 540
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600
AGCCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660
TTTCTTAAGC TTACCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720
AAATTGACAA TCTAAACTAA CAAGTCTTTT GAATTTATGC ATGGTAGTAA ACATTCTCTC 780
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900
CTTGGTTTTT TATTTGGAGA GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACTAGAA 960
ATTATTTCTA AATACCAAA



SEQ ID NO:244 PBQ8 DNA sequence

Nucleic Acid Accession #:
Coding sequence: 3




1 11 21 31 41 51

I I I I I I

AAATGGCGTG CCCGTCTCTC CGCCGGCCCC CTGCCTCGCA GTGGTTTCTC CTGCAGCTCC 60
CCTGGGCTCC GCGGCCAGTA GTGCAGCCCG TGGAGCCGCG GCTTTGCCCG TCTCCTCTGG 120
GTGGCCCCAG TGCGCGGGCT GACACTCATT CAGCCGGGGA AGGTGAGGCG AGTAGAGGCT 180
GGTGCGGAAC TTGCCGCCCC CAGCAGCGCC GGCGGGCTAA GCCCAGGGCC GGGCAGACAA 240
AAGAGGCCGC CCGCGTAGGA AGGCACGGCC GGCGGCGGCG GAGCGCAGCG ATGGCCGGGC 300
[illegible] CGCGCTGCTG GCTCTGTGCG GGGCACTGGC TGCCTGCGGG TGGCTCCTGG 360
GCGCCGAAGC CCAGGAGCCC GGGGCGCCCG CGGCGGGCAT GAGGCGGCGC CGGCGGCTGC 420
AGCAAGAGGA CGGCATCTCC TTCGAGTACC ACCGCTACCC CGAGCTGCGC GAGGCGCTCG 480
TGTCCGTGTG GCTGCAGTGC ACCGCCATCA GCAGGATTTA CACGGTGGGG CGCAGCTTCG 540
AGGGCCGGGA GCTCCTGGTC ATCGAGCTGT CCGACAACCC TGGCGTCCAT GAGCCTGGTG 600
AGCCTGAATT TAAATACATT GGGAATATGC ATGGGAATGA GGCTGTTGGA CGAGAACTGC 660
TCATTTTCTT GGCCCAGTAC CTATGCAACG AATACCAGAA GGGGAACGAG ACAATTGTCA 720
ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTGAGA 780
AGGCAGCGTC TCAGCCTGGT GAACTCAAGG ACTGGTTTGT GGGTCGAAGC AATGCCCAGG 840
GAATAGATCT GAACCGGAAC TTTCCAGACC TGGATAGGAT AGTGTACGTG AATGAGAAAG 900
AAGGTGGTCC AAATAATCAT CTGTTGAAAA ATATGAAGAA AATTGTGGAT CAAAACACAA 960
AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGGATTAT GGATATTCCT TTTGTGCTTT 1020
CTGCCAATCT CCATGGAGGA GACCTTGTGG CCAATTATCC ATATGATGAG ACGCGGAGTG 1080
GTAGTGCTCA CGAATACAGC TCCTCCCCAG ATGACGCCAT TTTCCAAAGC TTGGCCCGGG 1140
CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATCT CGCAAGAATG 1200
ATGATGACAG CAGCTTTGTA GATGGAACCA CCAACGGTGG TGCTTGGTAC AGCGTACCTG 1260
GAGGGATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320
GCTGTCAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGGGAGGAT AACAAAAACT 1380
CCCTCATTAG CTACCTTGAG CAGATACACC GAGGAGTTAA AGGATTTGTC CGAGACCTTC 1440
AAGGTAACCC AATTGCGAAT GCCACCATCT CCGTGGAAGG AATAGACCAC GATGTTACAT 1500
CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560
CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620
GGGTTGATTT TGAACTGGAG TCATTTTCTG AAAGGAAAGA AGAGGAGAAG GAAGAATTGA 1680
TGGAATGGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AAAAGGCTTC TAGTTAGCTG 1740
CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800
CAGTTAATAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860
AAATAAATAG CCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920
TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTTAAG TAATTTTGCC 1980
ATCCTAGGCT TAAATGCAAT ATTCCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040
TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATG 2100
AATGCTATTG AAAAGGTTAA CAGATACAGC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160
TAAATAGTTC AGTATAAATT GTCGTTTTTT TCTTGTGCTG ACTAACTATA AGCATGATCT 2220
TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280
AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340
TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTGTA GAGTGGCCCA GAATTGCATT 2400
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA



SEQ ID NO:245 PBQ8 Protein sequence
Protein Accession #: P16870



MAGRGGSALL ALCGALAACG WLLGAEAQEP GAPAAGMRRR RRLQQEDGIS FEYHRYPELR 60 
EALVSVWLQC TAISRIYTVG RSFEGRELLV IELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120 
RELLIFLAQY LCNEYQKGNE TIVNLIHSTR IHIMPSLNPD GFEKAASQPG ELKDWFVGRS 180 
NAQGIDLNRN FPDLDRIVYV NEKEGGPNNH LLKNMKKTVD QNTKLAPETK AVIHWIMDIP 240 
FVLSANLHGG DLVANYPYDE TRSGSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTTNGGAWY SVPGGMQDFN YLSSNCFEIT VELSCEKFPP EETLKTYWED 360 
NKNSLISYLE QIHRGVKGFV RDLQGNPIAN ATISVEGIDH DVTSAKDGDY WRLLIPGNYK 420 
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF 



Nucleic Acid Accession #: AF038968 

Coding sequence: 91-1107 (underlined sequence corresponds to start and stop codon) 



1          11         21         31         41         51
|          |          |          |          |          |
GGGGCGACGT GAGCGCGCAG GGGGGCGGCG GCCTCGCCTC GTCTCTCTCT CTGCGCCTGG 60
GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTTGCC 120
GACCCGGATC TCAACAATCC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180
CCACCAGGAC TTGATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240
GTGAAGATGC CTAATGTACC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300
CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360
CAAGAAGAAC TAGAAAGAAA AGCCGCAGAA TTAGATCGTC GGGAACGAGA AATGCAAAAC 420
CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480
CCTTGTTTCT ATCAGGAATT TTCTGTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540
CTTATGTACT ACTTGTGGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600
TTGGCTTGGT TTTGTGTTGA TTCTGCAAGA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660
TTCTTGCTTT TTACTCCTTG TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720
AGGAGTGACA GTTCATTTAG ATTCTTTGTA TTCTTCTTCG TCTATATTTG TCAGTTTGCT 780
GTACATGTAC TCCAAGCTGC AGGATTTCAT AACTGGGGCA ATTGTGGTTG GATTTCATCC 840
CTTACTGGTC TCAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900
TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960
ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020
AAAACTGTCC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080






AATGCTTTCA AGGGTAACCA GATTTAAGAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140 
TGTACCTTTT TCTCCAGTTA CTGTATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200 
CAGACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 
GTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAG AAAAAATATT 1320 
ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380 
GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 
CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTGGAAAAG TAAACCATGT 1500 
TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560 
AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620 
TAGATAATGT AAAATTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTG 1680 
ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 
GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800 
CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT TTATATAATA 1860 
TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920 
CTTTTT 



SEQ ID NO:247 PBY4 Protein sequence:

Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60 
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIPGC LAWFCVDSAR 180 
AVDFGLSILW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIPV GIMMIIIAAL FTASAVISLV MFKKVHGLYR TTGASFEKAQ 300 
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



SEQ ID NO:248 PBH2 DNA sequence 
Nucleic Acid Accession #: none found 

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon) 

ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC 60 
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120 
AGTGCACTAC AATTTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240 
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTGCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTATGGAAAT 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420 
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480 
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATGCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 PBH2 Protein sequence: 
Protein Accession #: none found 

MRDNKSCAFF MGKLNVCFEG TVIAGYSVFA TTCIIHLAVA SALQEPKKSS HPHRTALHLA 60 
SANGNSEVVK LLLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120 
TALHYAIYNE DKLMAKALLL YGADIESKNK HGLTPLLLGV HEQKQQVVKF LIKKKANLNA 180 
LDRYGRCVTL GTLFTTKYVV IYEK 
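
The relationship between a listed coding sequence and its protein sequence can be checked by translating the DNA from the stated start position. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and do not come from the listing. It translates the first row (positions 1-60) of the PBH2 DNA sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, start=1):
    """Translate from 1-based position `start` until a stop codon or the end."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    residues = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unreadable codon
        if aa == "*":  # stop codon
            break
        residues.append(aa)
    return "".join(residues)


# First row of the PBH2 DNA sequence (SEQ ID NO:248), positions 1-60.
first_row = "ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC"
print(translate(first_row))  # MRDNKSCAFFMGKLNVCFEG
```

The result corresponds to the first twenty residues of the PBH2 protein sequence. For sequences whose coding region does not begin at position 1, the `start` argument would be set to the first coordinate given on the "Coding sequence" line.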


SEQ ID NO:250 PBJ1 DNA sequence 
Nucleic Acid Accession #: XM_005829 

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon) 

ATGGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120 
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180 
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300 
ATCAGGAGCA GATTTGAAGA ATTACAAAGT GAATTGGTGC CAGTCAGCAT GTCAGAGACA 360 
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420 
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480 
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540 
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600 
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660 
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720 
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900 
AATAAGGGAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960 
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020 
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080 
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140 
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200 


CAGTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATG TGAAGAGGCA 1260 
CGCCAAGAAA AAGAAGCAAT GGTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAGAT 1320 
CTTCGAAAGG AAAAAGAGAC ACTTGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380 
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTGCA CCAGCTGTAT 1440 
GAAACTAAGG AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500 
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560 
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620 
GAAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680 
GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTGAA 1740 
AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800 
GAAGATCTGA AGAGAACATT TAAGGAGGGT ATGGATGAGT TAAGAACACT GAGAACAAAG 1860 
GTGAAATGTC TAGAAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920 
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980 
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040 
GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160 
GAATCCAATT CTTCGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220 
AGCCAGTGTG AACAAATGAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280 
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340 
GAAGTTAAAG CATTGAGTAC CCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400 
CGTAAACATG CCTCTAGTAT CAAGGATCTC ACCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460 
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580 
GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640 
ATAGTTAGGC TGCAAAAAGC ACATGCCCGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700 
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
TTACGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAGACGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATCCAGCTGA CAATGGATTA 2880 
ACATTGGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940 
CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000 
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060 
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTCCACTT TTTGTTTCAG CCAGTAAAAA TATTGTTTTG CTTCATCTGT ACACAAAAAA 3180 
ATACCCinT ACAATATGAA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240 
AATTTGTTTT TGTATGGTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300 
CTTGTATTCA TAAGAAGTGT TGAACATTAC AAGGGC1T1T AT 



SEQ ID NO:251 PBJ1 Protein sequence: 
Protein Accession #: NP_060487 

MVIIYLSFCN YYMEFYREEL PHIDYLIDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60 
LLLNNGSSAT LKTRTRCYGT PRGLPHRSLL QPTPPTCKTK IRSRFEELQS ELVPVSMSET 120 
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNEHKNRIE 180 
AQENYIPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAKH ISKTNETEQK VTQILVELRS 240 
STFPESANEK TYSESPYDTD CTKKFISKIK SVSASEDLLE EIESELLSTE FAEHRVPNGM 300 
NKGEHALVLF EKCVQDKYLQ QEHIIKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360 
KQHMNTIKQL ESRIEELNKE VKASRDQIIA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420 
RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKIKQLS QEKGRLHQLY 480 
ETKEGETTRL IREIDKLKED INSHVIKVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540 
EEADQIRKNC QDMIKTYQES EEIKSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKIKEL 600 
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE IINRQKAEIQ NLLDKVKTAD 660 
QLQEQLQRGK QEDENLKEEV ESLNSLINDL QKDIEGSRKR ESELLLFTER LTSKNAQLQS 720 
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLESRLLKE EELRKEEVQT LQAELACRQT 780 
EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK LDQVESGSYD KEVSSMGSRS 840 
SSSGSLNARS SAEDRSPENT GSSVAVDNFP QVDKAMLIER IVRLQKAHAR KNEKIEFMED 900 
HIKQLVEEIR KKTKIIQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960 
TLELSLEINR KLQAVLEDTL LKNTTLKENL QTLGTEIERL IKHQHELEQR TKKT 



Nucleic Acid Accession #: 083760 

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 

I I I I I I 

TTGCCGTGAA GGGCTGTGCG GTTCCCGTGC GCGCCGGAGC CTGCTGTGGC CTCTTATGCA 60 

CTCCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTGCT 120 

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCCCTC GACGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGGATCTGCA 360 

GTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCTCCTGT 480 

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540 

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600 

GCAGCCTCCG TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720 

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTGG 780 
CCAACCTGTA GATGCCACAG CTGATAGACA TGTAGTGCTA TCGATACCAA ATGGAGACTT 840
TCGACCAGTT TGTTACGAGG AGCCCCAGCA CTGGTGCTCG GTCGCCTACT ATGAACTGAA 900
CAACCGAGTT GGGGAGACAT TCCAGGCTTC CTCCCGAAGT GTGCTCATAG ATGGGTTCAC 960
CGACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020
CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTCGG 1080
GGGAGAGGTG TATGCCGAGT GCGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140
CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200
CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260
TGAAGTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320
GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380
TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T






SEQ ID NO:253 PBJ5 Protein sequence: 
Protein Accession #: NP_005696 



MHSTTPISSL FSFTSPAVKR LLGWKQGDEE EKWAEKAVDS LVKKLKKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120 
QKEVCINPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATYPDS 180 
FQQPPCSALP PSPSHAFSQS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASETQ 240 
SGQPVDATAD RHVVLSIPNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVLIDG 300 
FTDPSNNRNR PCLGLLSNVN RNSTTENTRR HIGKGVHLYY VGGEVYAECV SDSSIFVQSR 360 
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQSVHH GFEVVYELTK MCTTRMSFVK 420 
GWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO:254 PBJB DNA sequence 
Nucleic Acid Accession #: AB04684 

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon) 



1          11         21         31         41         51
|          |          |          |          |          |
TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60
GATGGACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120
TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180
ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240
GCATGCTTTT TTTCTCTTGC AGGGTATGTT ?????????? CTTTTTCTTT TAGAAGCTAC 300
TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360
TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420
TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480
GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540
GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600
AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660
ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720
CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780
AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840
TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900
ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960
TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020
TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080
GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140
GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200
CTGAGCTCCG AGAAGAATGA CACCAGCCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260
TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320
TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380
GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440
ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAGAACAGC 1500
AGCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAG CAATCCCCAA AGTCCGCATA 1560
AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620
GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680
ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740
CTCCAGTCTG CGGTCGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800
ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860
CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920
GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980
ACTGTCGTGG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040
CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100
CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160
CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220
TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCCCTGTTT ACATCCCAAA CCTCAGTCCT 2280
CCCGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340
TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400
GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG CCTCCTTTCC 2460
CATGCCCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520
CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580
ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CACACTGTCA CAAAAATTCA GTCTGGCATA 2640
ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700
GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760
CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820
ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880 

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGAGTGTG GGGCCATCTG CAGGTCGGTG 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTGCATT GCAATGTTGT GTACTCTGAT GTGGCTGCTC TGAAGTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGACGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACCCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC CCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TGTTTCTTTA AAACAGAGTT CTXAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTGTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



Protein Accession #: BAB13455 

MKTPDFDDLL AAFDIPDMVD PKAAIESGHD DHESHMKQNA HGEDDSHAPS SSDVGVSVIV 60 
KNVRNIDSSE GGEKDGHNPT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQFSPISSA EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 
KTGLSTSGNV EKNKAVKRET EASSINLSVY EPFKVRKAED KLKESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNLIDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360 
IKTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAPLQ 420 
SAVVTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480 
VQSASSAIIK AANAIQQQTV VVPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540 
KPQQQIKQAI INAAASQPPK KVSRVQVVSS LQSSVVEAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVRIEVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGVVM QCSHLILKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGITG 720 
TVISAPSSTP ITPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHFQQ AADTSGQKTC 780 
TICQMLLPNQ CSYASHQRIH QHKSPYTCPE CGAICRSVHF QTHVTKNCLH YTRRVGFRCV 840 
HCNVVYSDVA ALKSHIQGSH CEVFYKCPIC PMAFKSAPST HSHAYTQHPG IKIGEPKIIY 900 
KCSMCDTVFT LQTLLYRHFD QHIENQKVSV FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPPNLGINL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGIRKVY 1080 
ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140 
EEPVLEFRPP RGAITQPLKK LKINVFKVHK CAVCGFTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
RECGLCYTSH VSLSRHLFIV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK 

SEQ ID NO:256 PBM1 DNA sequence 
Nucleic Acid Accession #: AF111847 

Coding sequence: 58-1608 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

TTTTCGTCGA CTCTTACCGG TTGGCTGGGC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60 

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120 

AACAAGGTGT GTTTTGATTG TGGTGCCAAA AATCCCAGCT GGGCAAGCAT AACCTATGGA 180 

GTGTTCCTTT GCATTGATTG CTCAGGGTCC CACCGGTCAC TTGGTGTTCA CTTGAGTTTT 240 

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTCCTTTTTT CATCAACATG GGTGTTGCAC CAATGACACC 360 

AATGCCAAGT ACAACAGTCG TGCTGCTCAG CTCTATAGGG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAGCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT GGTTCCACCT 480 

TTGTCCCCTC CACCAAAGGA GGAAGATTTT TTTGCCTCTC ACGTTTCTCC TGAGGTGAGT 540 

GACACAGCGT GGGCATCAGC AATAGCAGAA CCATCTTCTT TAACATCAAG GCCTGTGGAA 600 

ACCACTTTGG AAAATAATGA AGGTGGACAA GAGCAAGGAC CAAGTGTGGA AGGTCTTAAT 660 

GTACCAACAA AGGCTACTTT AGAGGTATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAAGT TTGGGAGCTC AGAAACTGGC AAACACATGC 780 

TTTAATGAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTG 840 

GCCAAGGTGG TATCTAAAGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAATGAAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCATGGG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140 
GAGTTAAGGA GCAGTTCTTT CTCTAGCTGG GATGACAGTT CAGATTCCTA TTGGAAAAAA 1200 

GAGACCAGCA AAGATACTGA AACAGTTCTG AAAACCACAG GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGCCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320 

GGCAATGTCA AGGCCATTTC ATGAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGACCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCATAAGCTC GGCTGATCTG 1440 

TTCGAGGAGC CGAGGAAGCA GCCAGCAGGG AACTACAGCC TGTCCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TTGCTGGAAA ACTCTCCGTC 1560 

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTCTTAATA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAAGAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740 

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGCGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040 

TTCCTTCAAA AGACCAAAAG TGACTGGTGT CTCGTGTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAAATAATAT TTGAAGTCAT CTGTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATGTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

CTGCCCTGCC AAGGGAATTA ATGTTATCTT GTGAAAGGTG TTGCTGTTTG AATTGATGAG 2280 

AAATGGAAGA TGAGAACTCC CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340 

ACGGTATACA GAGTTAAAGT GGAATGAGGT AAGAAGATAC AGCTACAGAA AATAGTTGCG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAATAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA CTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAATGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640 

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 
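
The nucleic acid rows in this listing are written as six blocks of ten bases followed by the position of the last base in the row. A minimal sketch of how such rows might be joined and checked is given below; the helper name `join_rows` is an assumption for illustration and is not part of the listing. It uses the first two rows of the PBM1 DNA sequence and confirms that the codon at the stated coding start (position 58) is ATG.

```python
def join_rows(rows):
    """Concatenate numbered sequence rows, checking each trailing position counter."""
    blocks = []
    length = 0
    for line in rows:
        parts = line.split()
        counter = None
        if parts and parts[-1].isdigit():
            counter = int(parts.pop())  # position of the last base in this row
        blocks.extend(parts)
        length += sum(len(p) for p in parts)
        if counter is not None and counter != length:
            raise ValueError(f"row ends at {length}, listing says {counter}")
    return "".join(blocks)


# First two rows of the PBM1 DNA sequence (SEQ ID NO:256).
rows = [
    "TTTTCGTCGA CTCTTACCGG TTGGCTGGGC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60",
    "GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120",
]
seq = join_rows(rows)
print(len(seq), seq[57:60])  # 120 ATG
```

A mismatch between the running length and a row counter raises an error, which is one way a dropped or split block in a row could be detected.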



SEQ ID NO:257 PBM1 Protein sequence: 
Protein Accession #: CAB76901 

MGDPSKQDIL TTFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLCIDCSG SHRSLGVHLS 60 
FIRSTELDSN WSWPQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKIKSL 120 
ASQATRKHGT DLWLDSCVVP PLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180 
ETTLENNEGG QEQGPSVEGL NVPTKATLEV SSIIKKKPNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEIEKQAQ AADKMKEQED LAKVVSKEES IVSSLRLAYK DLEIQMKKDE KMNISGKKNV 300 
DSDRLGMGFG NCRSVISHSV TSDMQTIEQE SPIMAKPRKK YNDDSDDSYF TSSSSYFDEP 360 
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420 
FGNVKAISSD MYFGRQSQAD YETRARLERL SASSSISSAD LFEEPRKQPA GNYSLSSVLP 480 
NAPDMAQFKQ GVRSVAGKLS VFANGVVTSI QDRYGS 
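
The coding sequence coordinates given for each nucleic acid imply a protein length, since the span from start codon to stop codon is a whole number of codons and the stop codon encodes no residue. The sketch below shows the arithmetic under that assumption; the function name `residues_from_coordinates` is hypothetical. It is applied to the coordinates stated for PBM1 (58 to 1608) and for the sequence encoding SEQ ID NO:253 (56 to 1459).

```python
def residues_from_coordinates(start, end):
    """Number of residues encoded between 1-based `start` and `end`, inclusive,
    assuming the span runs from the start codon through the stop codon."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("coding span is not a whole number of codons")
    return span // 3 - 1  # the stop codon does not encode a residue


print(residues_from_coordinates(58, 1608))  # 516
print(residues_from_coordinates(56, 1459))  # 467
```

The counts can be compared against the residue numbering at the end of each protein row; a disagreement suggests either a misread coordinate or residues lost from the protein listing.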



SEQ ID NO:258 PBM4 DNA sequence 
Nucleic Acid Accession #: 030891 

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

ATGGATACTG TCATGAAGCA GACACATGCT GACACACCTG TTGATCATTG TCTATCTGGC 60 
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTGAAATGC AGAATCCAAA TTTGAACAAT AAAGAATGTT GTTTCACCTT TACGTTGAAT 180 
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGAGT 240 
ATCTACTCAG CCCTGAGTGC TAATGACTAT TTCAGTGAAA GGATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AGAAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTGA TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420 
AAAGAAGATG GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATG CATTCTTTTT 480 
CATGTTGTTG CTATAGGAAG GACAAGAAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTITGTAT TTATGCCTTG AAGGGTGAGA CTATTGAAGG AGCCTTATGC 600 
AAGGATGGCC GTTTTCGGTC TGACATAGGT GAATTTGAAT GGAAACTAAA GGAAGGTCAT 660 
AAGAAAATTT ATGGAAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720 
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTGATAC AGTCTAAGAA AAAAGTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900 
CCTCAGGATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 
AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGGCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATG TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAGA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1140 
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTGCAA CCTGOGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320 
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACX3TGTCGAC ATGTTGTACA TCTTATGGTG GGTAAAAACA CACATGCAAG TTTGTGGOCA 1440 
GATATAATTA GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTOCT 1500 
GACAATTGGT TTTCCATTGA GCCATGGCTT AAAGTGTCCA ATGAAAATCT AGATTATGCC 1560 
ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGCG ACAGATTTCT 1620 
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTGTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACGATTGT 1740 
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCTTA GTTATGATAC TTGTTTCTCT 1860 
GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920 
TrrGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TTCGCTTTCG CTCTCGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160 
ATAGAAGCAG GCAAGGACCG CCGTGGGCAC GGGGTCAGTG AGACAGGGTC CTGCTCGCGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTOCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400 
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC CCCAAAATAG GACAATATAT 2520 
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGCCAAGA AATGCTTGTG 2580 
CGTGGCACAG AAGGAATCAA AGAGTACATA AACCTTGGAA TGCCCCTCAG TTGTTTCCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GGAAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTGTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATTGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TGAGAGAACA AATCGTGGCT CAGTACCCCA GTTTGAAAAG AGAAAGTGAA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCGATTAA AGTAGTGAAA 3240 
CTTCTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360 
AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420 
AGGGTGACAT TTGGTTATGA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTG AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGAAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TTGGCCATCC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TGCTGTGATC 3660 
CCICAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCTAAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAACCCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
CGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAACCAT GGTATGAAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGGACTTGT GAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080 
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAGGCATTTT TCTAAGCACA TGAAGAAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA ATTTTTTTTT TTTTTGAGAC TGAGTCTCAC 4260 
TCTGTCGCCr GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTGAGCAGCT GGGATTACAG GCAAACGCCA 4380 
CCACACCCAG CTAAATTTTT TTTTTTTTTT TGTATTTTTA GTAGAGACAG GGTTTCACCA 4440 
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGGAA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040 
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100 
ATCTGTTTTC CATTTCCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160 
CTAGGTCCAG GGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220 
GGGGTCGAGT GTAGGAAAAC AGCCTGTTGC ATTGTAAGAG TGATGTCACC TTGAAGAGCA 5280 
GCTGGCATGA TGACTGCTGT TTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAACC TCCCCAGTGT 5400 
TCAGGTGTTT CACAAGAAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460 
GTTATATAAA GGGTACTTTT GGGAGGGTGA GTGCCGCCAT TTAGTGGCTG CTAGAAACAT 5520 
TGCTTCTGTT TGTAAGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 
SEQ ID NO:259 PBM4 Protein sequence:
Protein Accession #: BAB67788 

MDTVMKQTHA DTPVDHCLSG IRKCSSTFKL KSEVNKHETA LEMQNFNLNN KECCFTFTLN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERIKNQFN KNIIVYEEKT IDGHINLGMP 120 
LKCLPSDSHF KTTFGQRKSS KEDGHILRQC ENPNMECILF HWAIGRTRK KTVKINELHE 180 
KGSKLCIYAL KGETIEGAIjC KDGRFRSDIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240 
DISKKKALQQ KDIHKKIKQN ESATDEINHQ SUQSKKKVH KPKKDGETKD VEHSREQILP 300 
PQDLSHY1KD KTRQTIPRIR NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
IXKNYQTLNE AIMHQYPNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DFGKMTANSV 420 
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHVVHLMV GKNTHPSLWP 480 
DIISKCAKVT FTYTEFCPTP DNWFSIEPWL KVSNENLDYA ILKLKENGNA FPPGLWRQIS 540 
PQPSTGLIYL IGHPEGQIKK IDGCTVIPLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHALIEFGYS 660 
MDSILCDIKK TNESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PILGTGETGR 720 
IEAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780 
GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QTMPQNRTIY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGIKEYI NLGMPLSCFP EGGQVVITFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETIKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KIIENFKKKM KVKNGETLFE LHRTTFGKVT KNSSSIKVVK 1080 
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATIIGQCV 1140 
RVTFGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200 
IHIIGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNPD 1260 
VITYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSIIEFGSTM ESILLDIKQR 1320 
HKPWYEEVFV NQQDVEMMSD EDL 



SEQ ID NO:260 PBQ1 DNA sequence 
Nucleic Acid Accession #: NM_015642 

Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 
I I I I I I 

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 
TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120 
CTCATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 
TGCAGCCGCT CTCTGCTCCC TGCCCCAATG AACATCTGCA CTAGGCCCAA GCCTTGGAGT 240 
AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 
AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360 
CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 
CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 
GCAAGGGGAT GACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540 
TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTG TGACGTAACG GTGCGCATCC 600 
ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CGGCAGCCCC TTCTTCCAGG 660 
ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 
TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 
TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 
GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 
CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960 
ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 
CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 
AGACTGCGCT CGGCCTGCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 
TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200 
GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260 
AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 
ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 
AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 
AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 
AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560 
CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 
GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 
CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 
TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 
TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860 
CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 
CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 
GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 
AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 
GTTGGCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160 
TGAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220 
TGCACATGCG CCTCCACCGG GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 
TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGACCC 2340 
CCCCTGCAGG CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 
AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 
ACGACCACAT GAGGATGCAT GTGTCTGACG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 
AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580 
GAAATGTTTT GGTTTCATTT TTACTTTCTG TTTTTGTTTT TGTTTCGTTT CATTTTGTAC 2640 
TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 
ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 
CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 
AAAAAAAAA 



PBQ1 Protein sequence:
Protein Accession #: NP_056457 

MTERIHSINL HNFSNSVLET LNEQRNRGHF CDVTVRIHGS MLRAHRCVLA AGSPFFQDKL 60 
LLGYSDIEIP SVVSVQSVQK LIDFMYSGVL RVSQSEALQI LTAASILQIK TVIDECTRIV 120 
SQNVGDVFPG IQDSGQDTPR GTPESGTSGQ SSDTESGYLQ SHPQHSVDRI YSALYACSMQ 180 
NGSGERSFYS GAVVSHHETA LGLPRDHHME DPSWITRIHE RSQQMERYLS TTPETTHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ RVQILERNES EECTEDTDQA EGTESEPKGE 300 
SFDSGVSSSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQLETGASSP 360 
ERSNEVEMDS TVITVSNSSD KSVLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPGLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHTGE KPHQCSICWR 540 
SFSLKDYLIK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CYICKKKFSH 600 
KTLLERHVAL HSASNGTPPA GTPPGARAGP PGVVACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO:262 PBQ6 DNA sequence 
Nucleic Acid Accession #: AI654187 

Coding sequence: 1-912 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

ATGGTGGAAG AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGCC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCACGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTCGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780 

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO:263 PBQ6 Protein sequence: 
Protein Accession #: NP_060170 

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAHTL TFEEEHHMKR MMAKREKUK EUQ1BKDYL NDLELCVREV 120 
VQPLKNKKTD RLDVDSLFSN IESVHQISAX LLSLLEEATT DVEPAMQVIG EVFLQIKOPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:264 PBY7 DNA sequence 
Nucleic Acid Accession #: NM_014323 

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 

I I I I I I 

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGCCGCCGC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC CCCTCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGCGCCGGC 300 

GCCGCCTGGC GGGCGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACTACAGA 540 

GCCTCGGGCC GGCACGTGTG GGGAGTGTGG ACACGTCTGC TGCGCCCCGC TTCTCGCTGC 600 
TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660 

CATGGAGCGG GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720 

CAGACACAGC ACGGAGATGC TGCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780 

CTGCGACGTG CTCTTGCGGG TAGGCGACGA GAGCTTCCCA GCGCACCGCG CCGTGCTGGC 840 

CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900 

GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CCAGGCGGCG GGGCCGGGGG 960 

CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020 

CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080 

CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTCCAA 1140 

CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTCGCC CCCCTGGGAC 1200 

CTCGGACTTG GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260 

TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320 

TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCCCA TGGTGGCTGG 1380 

ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTCCCCAGT GTGGCATCCA GTGCCCCTCC 1440 

CCTGACTGGC AAGCGAGGCC GGGGCCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500 

TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560 

GTTCACTGAT GCCAACCGGC TCCGGCAGCA CGAGGCCCAG CACGGTGTCA CCAGCCTCCA 1620 

GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680 

AGACCCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740 

CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTCCC ACTCTGGGGA 1800 

GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860 

CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920 

AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980 

GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040 

CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100 

ATACATGGCA GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160 

TAACCGAGGT TTCTCCTCTG CCTCCTACTT AAAGGTCCAT GTTAAAACCC ACCACGGTGT 2220 

TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280 

CTGCGCCAGG ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTGA 2340 

GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400 

GAGTGCCAAT GGCTCTTTCT CCTGCGACAT GGCAGTCCCC AAAAACAAAA TGGAGTCTGA 2460 

TGGGGAGAAG AAGTACCCAT GCCCTGAATG TGGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520 

GAACAAACAC ATCCAGAAGG TGCATGTCCG GGCTCTCGGG GGCCCCCTGG GGGACCTGGG 2580 

CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640 

GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700 

GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760 

GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820 

AGATTTTTAT TCATTTTTAA CTGCCCCCCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880 

TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940 

ACTTGGTATG GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCCCA 3000 

GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA CCCCAATCCT 3060 

ATACCCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCCC ACTTCTCTAG 3120 

TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180 

TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG ACCATGGGGT 3240 

GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300 

GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360 

CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420 

TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480 

TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACCG CAGTCACACA TTAGGGTTAG 3540 

TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600 

GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660 

TTTCAATGCT GTTGGGAACC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720 

TCAGTTGTGT CACATGTGAG CAAGCCCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780 
GACTGTATTA AAAATGTTAG TACATTACTC TA 



SEQ ID NO:265 PBY7 Protein sequence: 
Protein Accession #: NP_114439 

MERVNDASCG PSGCYTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60 
ACSEYFESVF SAQLGDGGAA DGGPADVGGA TAAPGGGAGG SRELEMHTIS SKVFGDILDF 120 
AYTSRIVVRL ESFPELMTAA KFLLMRSVIE ICQEVIKQSN VQILVPPARA DIMLFRPPGT 180 
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240 
PLSPQLLTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCGLCGKV 300 
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACEIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPYICQSCGK 420 
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600 
FSPQQNMSLL ESFGFQIVQS AFASSLVDPE VDQQPMGPEG K 



SEQ ID NO:266 PBY9 DNA sequence 
Nucleic Acid Accession #: NM_012429 

Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60 

GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120 






TGCCGCACCC 
GCAGAGTCGG 
TCCAGGATGT 
GAGCCAGAAG 
GAAAGCAAAA 
ATCTGTCAGG 
TTGGACCTCT 
CCAAGATGCG 
GGAGGAAGGT 
TCTGGAAGCC 
CCGAAACACT 
ACCTCATCAA 
ATTGGAAGGA 
GCACCATGAC 
ACATCCCCAG 
AGATTTCCCG 
TCAGGTGGCA 
AGATGGGAGA 
ACTCCCACCT 
TGCGGTTTGA 
TCCTGCTTCC 
AATAACACCT 
CCTTGTAGCA 
CCTCAGGAGC 
TATCAAATAC 
CTGTAAACTG 
TGTACCACAG 
ACTTCAGGGA 
TCGCAATGAG 
TCCAAACATT 
GGCCTGAGTC 
GACTTTGGCA 
CTCAGAGCTT 
GGGAAATGAC 
GAATGCTAAA 
TCCTCCATGT 
GAGAGGGTGT 
CTGAGCAAGG 
GTTCAGGTGC 
GGCGGGGCCG 
CAGCCCTTAC 
ACATGGGAAG 
CCCGTCTGGG 
CGGACCGGAA 
TGGGTTTACA 



CTTCGACCTG 
GGACATTGAC 
GGGTATGTGT 
GGATGCCAAG 
GGAGTGTGAG 
GGAGACCATC 
TGCTGTGGAG 
GAAGCGTCTT 
ACCCTTCCTG 
GGTTTTACTG 
TGACCCTGAT 
GAAGTATTAT 
TGGCTCCTCC 
GTTTATGTCA 
GAGGCAGCGG 
GGTCCCTGAA 
CAACACCTAC 
AGACAAAGCC 
TCTCCTATAG 
GTCATTTTCX3 
TTTCATTTCA 
CTAAGGAGTC 
TGCCAACTTC 
GGTGGCAGCA 
AGTCAGCTGC 
GAGTAGCAGG 
TTAGCACTGA 
AGCACACATC 
ACTCCTGGGC 
CCTGGGACTT 
CCACAGGGAT 
AGCAGATCGT 
GAGCAACCCC 
TTGCCAGTCT 
TCTTACTAAG 
CGGTCGGCGT 
GCGTCTCGCA 
CCCAATCCCA 
GCGGCCCCAG 
AAGCTCATCT 
GGGGCCGAGG 
ACGCTGTTAG 



GCCCCCAAAC 
CCCAGGCAGA 
CTGCCGAATC 
CAGAAGTCGG 
AACATCATTA 
GGCTATGACC 
GGTCTGCTGT 

ACCATAATTT 
GCCTATGGAG 
TTTGTTGTTA 
AGTGAGGACA 
AAACATATCA 
GGAAACCCCA 
GTGCGAGACC 
CACCAAGTGG 
GATGGAGCGG 
GCAGGGGAGA 
GATGGGACCC 
AGCTTCATTC 
TCAGAAGAGA 
CAGGCCTGGC 
CACAACCCTG 
GTTAGGCAGA 
CCCAGGAGCT 
ACCTGTCCAG 
GGGAAAAAAA 
CGGGGAGAAA 
GTAGCTGGTT 
GGCTGGGGTA 
TTCCCACTCG 
CACACGGCCT 
CGGGTACCCA 
CGCAGCTGCA 
CCAGTGCCCT 
GAGACAAAAA 
GAGTGTCCCG 
CAGTCCCATC 
AGCCAGGCCT 
GACTAGGGGC 
CGAGCCCCGC 
ACCTGGCGGG 
TGCGAAGCTG 
CTGCACGGGC 
GAAAATTAAC 




GCGGGGGGCA 
GGGACTTAAC 
TAAGTTTAGA 
CAGAAACTCT 
CTCAGGAGGT 
CTCACTTTGA 
TTGTCAGTGA 
CAAAGCGCCA 
GGGTGGGAGT 
CACCAAGCAG 
GCAGCACTCG 
CCTGACTGAC 
CGTGCAGGGA 
GCCGCCCAAA 
CTCGAAACCA 
GCTTTAGAGA 
AGGCCCCGGC 
CAGGCTAGCG 
CATCCCGGCC 
AGTGCGCA 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



SEQ ID NO:267 PBY9 Protein sequence: 
Protein Accession #: NP_036561 

MSGRVGDLSP RQKEALAKFR ENVQDVLPAX PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60 
EFRKQKDIDN HSWQPPEVI QQYLSGGMCG YDLDGCFVWY DIIGPLDAKG LLFSASKQDL 120 
LRTKMRECEL LLQECAHQTT KLGRKVETIT HYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180 
NYPETLKRLF WKAPKLFPV AYNUKPFLS EDTRKK1M VL GANWKEVLLK MSPDQVPVE 240 
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEBLFPG 300 
CVLRWQFMSD GADVGFGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360 
YVLRFDNTYS HHAKKVNFT VEVLLPDKAS EEKMKQLG AG TPK 



SEQ ID NO:268 PBH8 DNA sequence 
Nucleic Acid Accession #: XM_009756 

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 
I I I I I I 

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 
CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120 
TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180 
TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240 
CCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300 
ATGAAATGTG TCTTGGCGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 
CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 
TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480 
GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540 
TTCCTGGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600 
ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTCCACC TCCGCTACGC ACACCACCTC 660 
CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GCGGGGCGGC 720 
TGGGTGTGGG TGCAGAGCTA CGCCACCGTG GTGCACAACA [missing] CCGGCCCCAC 780 
TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 
CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900 

CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960 

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCAGTGACCT TCTGTACACG CCATCCTACA GCCTGCCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200 

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260 

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320 

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA CGCAGACTGA 1440 
CTCCTGTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBH8 Protein sequence: 
Protein Accession #: NP_005060 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SATTSQLDKA SDRLTTS YL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240 
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300 
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VIITNGR 



SEQ ID NO:270 DNA sequence: 
Nucleic Acid Accession #: AA760894 

GGCACGAGGA GAAGATGTGG CTTGCTCATG CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60 
CCAGCCATGT GGAACTGTTT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120 
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180 
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATCCATC 240 
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300 
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATXjTCTTCA 360 
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420 
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480 
GAAATGAGGG ATTCTCTCCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540 
TTTGGACTTG CCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600 
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660 
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720 
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780 
TACACATGAA AACCCCCAAG GGGAATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840 
YGTGATCATY TAGAGATGTA CAGAAAAGGT GAATCTGTGT TCTGTATATT CTGCCTAAGG 900 
CAAAGAAATG TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960 
GAAAACTGTA AGCTTCCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020 
ATCTTCTTAC TTGGACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080 
TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140 
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA 

SEQ ID NO:271 PBQ4 DNA sequence 
Nucleic Acid Accession #: AA149579 

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60 

GGCATAAATG GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360 

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660 

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960 

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 

GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260 
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320 
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA 

SEQ ID NO:272 PBQ4 Protein sequence: 
Protein Accession #: none 

1 11 21 31 41 51 
I I I I I I 

MESISMMGSP KSLSETCLPN GINGIKDARK VTVGVIGSGD FAKSLTIRLI RCGYHVVIGS 60 
RNPKFASEFF PHVVDVTHHE DALTKTNIIF VAIHREHYTS LWDLRHLLVG KILIDVSNNM 120 
RINQYPESNA EYLASLFPDS LIVKGFNVVS AWALQLGPKD ASRQVYICSN NIQARQQVIE 180 
LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVVVAI SLATFFFLYS FVRDVIHPYA 240 
RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300 
CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360 
ISFGIMSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420 
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD 



SEQ ID NO:273 PBQ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001973 

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon) 






i 
I 

CCGCCGCCTT 
AGCGTGAGGA 
GAGCCCCGCG 
TTCTTCAGCT 
GGCAGTTTAA 
AGCCTAACAT 
TCATCAAAAA 
TGAACATGGA 
GTGAAGTCAG 
CTGGTGCCAA 
CTCTCAACTC 
CAGCCGAGAA 
TTGTCACGAC 
GCCCAAGTAT 
CAAAACTGCC 
CCACACCACC 
CACTGAGTTC 
AACTTCCAGA 
ACAAAGTAAA 
TTGTGATCAC 
CTTCTCTTAC 
TCTCCAGTAT 
TGCAAGGTGC 
CTCTGTCTGG 
CATAACCTAT 
GATTGCATTT 
TTTGCCATTC 
ACTATATGTA 
TTTCTTTTTC 
CTGAAGAAGT 
TTACTCCTTC 
TTAAAGAAGT 
AAAAAAAAAA 



11 
I 

CTACTCCGCC 
GGAGGCTGAG 
CGCGGCGTCG 
CCTGCAGAAG 
GCTTTTGCAG 
GAATTATGAC 
AGTGAATGGT 
TCCAATGACA 
CAGCAGTTCC 
GACCTCTAGC 
TTTGAACTCC 
ACTGGCAGAG 
ACCTTCCAAA 
TTCTCCATCT 
TTCCCTGGAA 
CATTTCGTCC 
TCACCCAGAC 
GAATTTGTCT 
TAATTCATCA 
GAGCAGTGAT 
ACCAGCATTT 
CCACTTCTGG 
TAACACACTT 
GCTGGATGGA 
GCACTTGTGG 
GAAGTGAGCA 
CCCATTGAAA 
TAAAAATGCC 
TTTCCTTCCT 

TGGCTATTGG 
ATTTGTGAAA 



21 
I 

GCGGGGGTCG 
GGCGGAGAGG 
CTCATTGCTA 
CCTCAGAACA 
GCAGAAGAGG 
AAACTCAGCC 
CAGAAGTTTG 
GTGGGCAGGA 
AAAGATGTGG 
CGCAATGACT 
TCCAATGTAA 
AAAAAATCTC 
AAGCCACCAG 
TCAGAAGAAA 
GCCCCAACCT 
ATACCCCCTT 
ATCGACACAG 
CTGGAGCCTA 
AGATCCAAGA 
CCAAGCCCAC 
TTTTCACAGA 
AGTACTCTCA 
TTCCAGTTTC 
CCTTCCACCC 
AATGAGAGAA 
ATTGATAGTT 
ACATCTTTTT 
TTAATTGGAG 
TCCTTTTCTT 
CTTTAGTGAC 
GACCCTTTGG 
TGAAAAAAAA 



31 
I 

CAGCGGCTGC 
CGCATCGTGT 
TGGACAGTGC 
AGCACATGAT 
TGGCTCGTCT 
GAGCCCTCAG 
TGTACAAGTT 
TTGAGGGTGA 
AGAATGGAGG 
ACATACACTC 
AGCTTTTCAA 
CTCAGGAGCC 
TTGAACCTGT 
CTATCCAAGC 
CTGCCTCTAA 
TGCAGGAACC 
ACATTGATTC 
AAGACCAGGA 
AACCCAAAGG 
TGGGAATACT 
CACCCATCAT 
GTCCTGTTGC 
CTTCTGTACT 
CTGGCCCATT 
CCGAGGAACG 
CTACAATGCT 
AGGATTCTCT 
TCTAAACTCC 
TTCTCCTTTA 

CCAGGAAAAA 
AAAAAAAAAA 



41 
I 

CGCGCCGTCC 
TCGAGGCGGA 
TATCACCCTG 
CTGTTGGACC 
CTGGGGGATT 
ATACTATTAT 
TGTCTCTTAT 
CTGTGAAAGT 
GAAAGATAAA 
TGGCTTATAT 
ATTGATAAAG 
CACACCATCT 
TGCTGCCACC 
TTTGGAGACA 
CGTAATGACT 
TCCCAGAACA 
AGTGGCTTCT 
TTCAGTCTTG 
GTTAGGACTG 
GAGCCCATCT 
ACTGACTCCA 
TCCCCTAAGT 
GAACAGTCAT 
TTCCCCAGAC 
AAGAAACAGA 
GATAATAGAC 
TTGAATAGGA 
ACCTCCCTCT 
AAAATATTTT 
AAAAGCAATT 
TTATGCTTAG 
AAAAAAAAAA 



51 
I 

TCGAGTTTCC 
GACCGAGGGG 
TGGCAGTTCC 
TCTAATGATG 
CGCAAGAACA 
GTAAAGAATA 
CCAGAGATTT 
TTAAACTTCA 
CCACCTCAGC 
TCTTCATTTA 
ACTGAGAATC 
GTCATCAAAT 
ATTTCAATTG 




60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



SEQ ID NO:274 PBQ5 Protein sequence: 
Protein Accession #: NP_001964 

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60 
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120 
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNV KLFKLIKTEN PAEKLAEKKS 180 
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSISPSSEE TIQALETLVS PKLPSLEAPT 240 
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300 
KDQDSVIXEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360 
TPULTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420 
PGPFSPDLQKT 



SEQ ID NO:275 PBY3 DNA SEQUENCE 

Nucleic Acid Accession #: AB040921 

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

AATCAGGAAC AGATCATATA TTGACCGAGA TTCTGAGTAT CTCTTGCAAG AAAATGAACC 60 

AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120 

GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGCCTTCG TATGGAATGC AAAAGGAATT 180 

GGTAAATTTA ATTGATAACC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTTGTGGCAA 240 

AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300 

TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAGTGCC ATTTCAGTTG CGGAAAGAGT 360 

AGCTGCAGAA AGGGCAGAAT CTTGTGGCAG TGGTAATAGT ACTGGATATC AAATTCGTCT 420 

CCAGAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480 

TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTGATGAAAT 540 

CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC TTCTCAATTT 600 

TCGATCTGAC TTGAAAGTAA TATTGATGAG TGCAACATTG AATGCAGAAA AGTTTTCAGA 660 

ATATTTTGGT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTCCGG TTGTGGAATA 720 

TCTTTTGGAA GATGTAATTG AAAAAATAAG GTATGTTCCA GAACAAAAAG AACACAGATC 780 

CCAGTTTAAG AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840 

AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900 

AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960 

TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTGCC 1020 

AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080 

AGATAAATTT TTAATTATAC CTTTACATTC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140 

GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200 

TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260 

TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATGC 1320 

CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380 

TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440 

GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500 

TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560 

GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620 

ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680 

CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740 

AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800 

CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860 

CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920 

CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980 

TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040 

TGCTGGTTTA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100 

GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160 

GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220 

TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280 

CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340 

TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400 

TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460 

CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520 

GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580 

GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640 

CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700 

ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760 

AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820 

CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880 

AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940 

ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000 
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC 

SEQ ID NO:276 PBY3 Protein sequence: 

Protein Accession #: BAA96012 

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60 
VNLIDNHQVT VISGETGCGK TTQVTQFILD NYIERGKGSA CRIVCTQPRR ISAISVAERV 120 
AAERAESCGS GNSTGYQIRL QSRLPRKQGS ILYCTTGIIL QWLQSDPYLS SVSHIVLDEI 180 
HERNLQSDVL MTVVKDLLNF RSDLKVILMS ATLNAEKFSE YFGNCPMIHI PGFTFPVVEY 240 
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300 
STVDVIEMME DDKVDLNLIV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360 
DKFLIIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDVVYV IDGGKIKETH 420 
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480 
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540 
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPLG KEKIADARRK ELAKDTRSDH 600 
LTVVNAFEGW EEARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660 
KDPESNINSD NEKIIKAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720 
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780 
QSPARIAHLV KELRKELDIL LQEKIESPHP VDWNDTKSRD CAVLSAIIDL IKTQEKATPR 840 
NFPPRFQDGY YS 



SEQ ID NO:277 PBY6 DNA SEQUENCE 

Nucleic Acid Accession #: AA464018 

Coding sequence: 64-1669 (underlined sequence corresponds to start and stop codon) 



GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60 
CTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120 
CTGATGACAT ACTTCATCCA GCTGGGCTTT GTCGAGAGTC GATTCTTCCC GCCCACACGG 180 
CAGATGGGAC TCCTGTTCAC CTGGTATGAC TCTCTCACCG GGGTTCCGGT CAGCCAGCAG 240 
AACCTGCTGC TGGAGAAGGC CAGTGTCCTG TTCAACACTG GGGCCCTCTA CACCCAGATT 300 
GGGACCCGGT GTGATCGGCA GACGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360 
AGAGCCGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA CCCATACTCC AAGTTACGAC 420 
ATGAGCCCTG CCATGCTCAG CGTGCTCGTC AAAATGATGC TTGCACAAGC CCAAGAAAGC 480 
GTGTTTGAGA AAATCAGCCT TCCTGGGATC CGGAATGAAT TCTTCATGCT GGTGAAGGTG 540 
GCTCAGGAGG CTGCTAAGGT GGGAGAGGTC TACCAACAGC TACACGCAGC CATGAGCCAG 600 
GCGCCGGTGA AAGAGAACAT CCCCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGCCCAC 660 
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720 
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780 
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCGCCGACA GCTGGGGAAG 840 
TCCCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900 
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAACGCTCC 960 
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGACGCCCCC 1020 
AGTGTTGTTG CTAAAACTGA GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080 
ACAGTCACGG ACTTCTTCCA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGCGG 1140 
TGGACGCCTC CTCGAAGCAT CGGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200 
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260 
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTG TGGATTGTAA GTGGCTGACG 1320 
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATCGA GATGAAAGTC 1380 
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGCCACATA CTCCGTGGGA 1440 
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTGATAAA 1500 
ACCAAGAAAA TCTCCAAGAA GCTTTCCYTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560 
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620 
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACTA A 



SEQ ID NO:278 PBY6 Protein sequence: 
Protein Accession #: NP_149094 

DFILEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYFIQLGF VESRFFPPTR 60 
QMGLLFTWYD SLTGVPVSQQ NLLLEKASVL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120 
RAAGVLNYLK DTFTHTPSYD MSPAMLSVLV KMMLAQAQES VFEKISLPGI RNEFFMLVKV 180 
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240 
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300 
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNLIDAP SVVAKTEQEV DIILPQFSKL 360 
TVTDFFQKLG PLSVFSANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCSASVA 420 
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKSATYSVG 480 
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540 
KLPSPFSLLN SDSSWY 

SEQ ID NO:279 PBY8 DNA SEQUENCE 

Nucleic Acid Accession #: AF107493 

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

| | | | | | 

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCGCTACTG CTGCTTCGGT 60 

[missing] GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180 

CATAGACAGG GATGACCGTG ATGAGCGTGA ATCCCGAAGC AGGCGGAGGG ACTCAGATTA 240 

CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300 

TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360 

TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600 

ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660 

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTGTTT^A 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960 

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020 

CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAATGGAACA AGTCTGTACA 1140 

ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200 

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260 

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380 

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500 

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGGAC 1560 

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGSAGTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 

TTATTGAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860 

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAGT TACATTCTTT 1920 

GATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980 

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040 

TTAAAAGTAG AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100 

ATACAAGGTT CATGTGAGTC TGCTTTCTTG ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160 

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220 

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CATTACAGTT 2280 

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA CTCACTGGTA GTTTGGAGTC 2340 

TCCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400 

TTTGCTTCTG TATATCACAG TGAGTGGATG GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460 

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520 

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCTTTT 2580 

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640 
CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA 



SEQ ID NO:280 PBY8 Protein sequence: 
Protein Accession #: XP_003261 

MGSDKRVSRT ERSGRYGSII DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60 
ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120 
FEGPQPADVR LMKRKTGESL LSS 



SEQ ID NO:281 PCI2 DNA SEQUENCE 

Nucleic Acid Accession #: AF208291 

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 

| | | | | | 

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGGC TGAGGGGGCC GAGCTCGCGC 60 

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCCGAT GGCCCCCGTG 120 

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240 

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTCCTT GCCGGTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420 

GGGCAAGTCC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480 

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540 

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600 

ACTGCCACCA CGTCTACTGC CACCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840 

AGCATCCTGG CCCGGTTGAG CACGGAGAGT GCCGATGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAATACAT TCGCCCAGTT 1020 

CTCCAGCAGG TAGCCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080 

CTCAAACCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140 

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260 

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATCCT 1440 

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTCA CCATGACACA CTTACTCGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740 

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980 

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040 

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCCCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340 

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640 

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGGCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGCG TGCTGGGCAC 2940 
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AACAATGCCA ATGCCTTTGA CACCAAGGGG AGCCTGGAGA ATCACTGCAC 
CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGCG AAGTATTGGT 
AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA 
AACGTGACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT 
CAGCAGCGGC CGGGCCCCCA CTTCCAGCAG CAGCAGCCAC TCAATCTCAG 
CAGCACATCA CCACGGACCG CACTGGGAGC CACCGAAGGC AGCAGGCCTA 
ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCAGCCACGG 
CCGCATCTGG CTGCAGCCGC TGCCGCTGCC CACCTCCCCA CCCAGCCCCA 
TACACTGCGC CGGCGGCCCT GGGCTCCACC GGCACCGTGG CCCACCTGGT 
GGCTCTGCGC GCCACACCGT GCAGCACACT GCCTACCCAG CCAGCATCGT 
CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC TCGCCCACCA TCCACCCGAG 
GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAG CCTCCACCGT 
TACCCACTGA GCCCCGCCAA GGTCAACCAG TACCCTTACA TATAAACACT 
GAGGGAGGGA GGGAGGGAGA GAATGGCCCG AGGGAGGAGG GAGAGAAGGA 
CCTCGGACCG TGGGCGCTGG CCTTTTATAC TGAAGATGCC GCACACAAAC 
GGGCAGGGGC GGGGGGGGGG GGGGCAGAGG GCAGGGGGAC GGGTCGGGAC 
CTTGAACCGG GAAGTGGGAG GACGTAGAGC AGAGAAGAGA ACATTTTTAA 
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTTTAAAAAA 



SEQ ID N0:282 PCI2 Protein sequence: 
Protein Accession #: NP_0735T7 

MAPVYEGMAS HVQVFSPHTL QSS AFCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60 
QPASTTVSTS LPVPNPSLPY EQTtVFPGST GHTVVTSASS TSVTGQVU3G PHNLMRRSTV 120 
SLLDTYQKCG LKRKSEEIEN TSS VQHEEH PPMIQNNASG ATVATATTST ATSKNSGSNS 180 
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQWKCWKRG TNETVAHOL KNRPS YARQG 240 
QIEYSILARL STES ADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300 
IRPVLQQVAT AJLMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDPGSA SHVSKAVCST 360 
YLQSRYYRAP EnLGLPFCE AIDMWSLGCV IAELFIjGWPL YPGASEYDQI RYISQTQGLP 420 
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYIF NCLDDMAQVN 480 
MTTDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPIE TLNHPFVTMT HLLDFPHSTH 540 
VKSCPQNMEI CKRRVNMYDT VNQSKTPHT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600 
ATOLANPEV SILNYPSTLY QPSAASMAA V AQRSMPLQTG TAQICARPDP FQQAUVCPP 660 
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG ULAQQAWPSG TQQILLPPAW 720 
QQLTGVATHT S VQHATVIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTGHVTLPAA 780 
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKRVKENTPP 840 
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTTVIPDT PSPTVS VITI SSDTDEEEEQ 900 
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960 
TGNPRTUVP PLKTQASEVL VECDSLVPVN TSHHSSS YKS KSSSNVTSTS GHSSGSSSGA 1020 
ITYRQQRPGP HFQQQQPLNL SQAQQHTTTD RTGSHRRQQA YTTPTMAQAP YSFPHNSPSH 1080 
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140 
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYKASPAST VYTGYPLSPA KVNQYPYI 

SEQ (D N0:283 PBY1 DNA SEQUENCE 

Nucleic Add Accession* NMJM7700 

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60 

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 

GCCTCCAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 

TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TGACAGGACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480 

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 

TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACCGGC CATGCAAGTA ATTGGAGAAG 660 

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840 

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGQC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 

CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID MQ484 PBY1 Protein sequence: 
Protein Accession #: NP.060170 

1 11 21 31 41 51 

420 



GGGGAACCCC 3000 

GGAGTGTGAT 3060 

GTCCTCCAGC 3120 

CACCTACCGG 3180 

CCAGGCTCAG 3240 

CATCACTCCC 3300 

CACTGTGCAC 3360 

CCTCTACACC 3420 

GGCCTCGCAA 3480 

CCACCAGGTC 3540 

TCAGTATCCA 3600 

CTACACTGGA 3660 

GGAGGGGAGG 3720 

GGGAGGCGCT 3780 

AATGCAAACG 3840 

ACCAGTGAAA 3900 

AAGGAAGGGA 3960 
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MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTEY 60 

NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120 

VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 
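
The relationship between a listed coding range and the protein listed under it can be checked mechanically. The following is a minimal sketch only, assuming the standard genetic code and assuming that coding coordinates are 1-based and include the stop codon; the function names `translate` and `protein_length` are illustrative and do not come from the listing. It uses nucleotides 121-180 of SEQ ID NO:283 and the stated coding range 147-806.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate the complete codons of dna, stopping at the first stop codon.

    Spaces are ignored; any character other than A, C, G or T raises KeyError.
    """
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def protein_length(cds_start: int, cds_end: int) -> int:
    """Residues encoded by a 1-based inclusive range that ends with a stop codon."""
    nucleotides = cds_end - cds_start + 1
    if nucleotides % 3:
        raise ValueError("coding range is not a whole number of codons")
    return nucleotides // 3 - 1


# Nucleotides 121-180 of SEQ ID NO:283, copied from the listing.
ROW_121_180 = "GCCTCCAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA"
ROW_START = 121
CDS_START, CDS_END = 147, 806

# Drop the untranslated bases that precede the start codon in this row.
fragment = ROW_121_180.replace(" ", "")[CDS_START - ROW_START:]

print(fragment[:3])                        # codon at the stated start position
print(translate(fragment))                 # first residues encoded by this row
print(protein_length(CDS_START, CDS_END))  # residue count implied by the range
```

Under these assumptions the fragment begins with ATG, its translation agrees with the first residues listed for SEQ ID NO:284, and the range 147-806 implies the same residue count as that protein listing.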



SEQ ID NO:285 PBQ9 DNA SEQUENCE 

Nucleic Acid Accession #: X66534 

Coding sequence: 523-2676 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

| | | | | | 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 

GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 

TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960 

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence: 
Protein Accession #: Q02108 

1 11 21 31 41 51 

| | | | | | 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 

QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 

QAVAAGVPVE VIKESLGEEV FKICYEEDEN ILGVVGGTLK DFLNSFSTLL KQSSHCQEAG 180 

KRGRLEDASI LCLDKEDDFL HVYYFFPKRT TSLILPGIIK AAAHVLYETE VEVSLMPPCF 240 

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF MFDKDMTILQ 300 

FGNGIRRLMN RRDFQGKPNF EEYFEILTPK INQTFSGIMT MLNMQFVVRV RRWDNSVKKS 360 

SRVKDLKGQM IYIVESSAIL FLGSPCVDRL EDFTGRGLYL SDIPIHNALR DVVLIGEQAR 420 

AQDGLKKRLG KLKATLEQAH QALEEEKKKT VDLLCSIFPC EVAQQLWQGQ VVQAKKFSNV 480 

TMLFSDIVGF TAICSQCSPL QVITMLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540 
ESDTHAVQIA LMALKMMELS DEVMSPHGEP IKMRIGLHSG SVFAGVVGVK MPRYCLFGNN 600 
VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPNFPSEI PGICHFLDAY 660 
QQGTNSKPCF QKKDVEDGNA NFLGKASGID 



SEQ ID NO:287 PFD2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000720 

Coding sequence: 119-6664 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

| | | | | | 

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT GTTCGTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CGGACCACGC 180 

GAACGAGGCA AACTATGCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTCCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCG 420 

ACCTGCCCGC GCCCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480 

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAG TGTAATTTTG GAACAATTAA CCAAAGAAAC 780 

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080 

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCCGAA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTGT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380 

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680 

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980 

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA AGTCCATCGC TTCGCTGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCCAGAT 2280 

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520 

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580 

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880 

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180 

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480 

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780 

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080 
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TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCGTCTTTT 4140 

CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200 

GACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260 

CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATGA GAGATAACAA 4320 

CCAGATCAAT AGGAACAATA ACTTCCAGAC GTTTCCCCAG GCGGTGCTGC TGCTCTTCAG 4380 

GTGTGCAACA GGTGAGGCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440 

TGACCCTGAG TCAGATTACA ACCCCGGGGA GGAGTATACA TGTGGGAGCA ACTTTGCCAT 4500 

TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGG GGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680 

ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800 

CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT CGAACGGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTG ATGATGAGGT 4980 

AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040 

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100 

GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA 5160 

TTTGCAAGAT GACGAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220 

AAATGGTGCC CTGCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTCCCT 5280 

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520 

TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580 

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGACCCCC ACTGCTTGGG 5700 

GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAG 5760 

CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTGAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880 

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000 

CATCTTCCCC CATCGCACGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120 

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCCCTGAT 6180 

CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTGCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480 

TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTCCCCGA GCCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780 

AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840 

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020 

AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA CACCTCGTGT 7080 

CGTTACCTCA GCCATCGGTC TAGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140 
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA 

SEQ ID NO:288 PFD2 Protein sequence:
Protein Accession #: A38198



1 11 21 31 41 51 

I I I I I I 

MMMMMMHKKM QHQRQQQADH ANEANYARGT RLPLSGEGPT SQPNSSKQTV LSWQAAIDAA 60 

RQAKAAQTMS TSAFPPVGSL SQRKRQQYAK SKKQGNSSNS RPARALFCLS LNNPIRRACI 120 

SIVEWKPFDI FILLAIFANC VALAIYIPFP EDDSNSTNHN LEKVEYAFLI IFTVETFLKI 180 

IAYGLLLHPN AYVRNGWNLL DFVIVIVGLF SVILEQLTKE TEGGKHSSGK SGGFDVKALR 240 

AFRVLRPLRL VSGVPSLQVV LNSIIKAMVP LLHIALLVLF VIIIYAIIGL ELFIGKMHKT 300

CFFADSDIVA EEDPAPCAFS GNGRQCTANG TECRSGWVGP NGGITNFDNF AFAMLTVFQC 360 

ITMEGWTDVL YWVNDAIGWE WPWVYFVSLI ILGSFFVLNL VLGVLSGEFS KEREKAKARG 420 

DFQKLREKQQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTENVS 480 

GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FNRRRCRAAV 540 

KSVTFYWLVI VLVFLNTLTI SSEHYNQPDW LTQIQDIANK VLLALFTCEM LVKMYSLGLQ 600 

AYFVSLFNRF DCFVVCGGIT ETILVELEIM SPLGISVFRC VRLLRIFKVT RHWTSLSNLV 660

ASLLMSMKSI ASLLLLLFLF IIIFSLLGMQ LFGGKFNFDE TQTKRSTFDN FPQALLTVFQ 720 

ILTGEDWNAV MYDGIMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780 

LNTAQKEEAE EKERKKIARK ESLENKKNNK PEVNQIANSD NKVTIDDYRE EDEDKDPYPP 840 

CDVPVGEEEE EEEEDEPEVP AGPRPRRISE LNMKEKIAPI PEGSAFFILS KTNPIRVGCH 900 

KLINHHIFTN LILVFIMLSS AALAAEDPIR SHSFRNTILG YFDYAFTAIF TVEILLKMTT 960 

FGAFLHKGAF CRNYFNLLDM LVVGVSLVSF GIQSSAISVV KILRVLRVLR PLRAINRAKG 1020

LKHVVQCVFV AIRTIGNIMI VTTLLQFHFA CIGVQLFKGK FYRCTDEAKS NPEECRGLFI 1080

LYKDGDVDSP VVRERIWQNS DFNFDNVLSA MMALFTVSTF EGWPALLYKA IDSNGENIGP 1140

IYNHRVEISI FFIIYIIIVA FFMMNIFVGF VIVTFQEQGE KEYKNCELDK NQRQCVEYAL 1200 

KARPLRRYIP KNPYQYKFWY VVNSSPFEYM MFVLIMLNTL CLAMQHYEQS KMFNDAMDIL 1260

NMVFTGVFTV EMVLKVIAFK PKGYFSDAWN TFDSLIVIGS IIDVALSEAD PTESENVPVP 1320 
TATPGNSEES NRISITFFRL FRVMRLVKLL SRGEGIRTLL WTFIKSFQAL PYVALLIAML 1380

FFIYAVIGMQ MFGKVAMRDN NQINRNNNFQ TFPQAVLLLF RCATGEAWQE IMLACLPGKL 1440 

CDPESDYNPG EEYTCGSNFA IVYFISFYML CAFLIINLFV AVIMDNFDYL TRDWSILGPH 1500

HLDEFKRIWS EYDPEAKGRI KHLDVVTLLR RIQPPLGFGK LCPHRVACKR LVAMNMPLNS 1560

DGTVMFNATL FALVRTALKI KTEGNLEQAN EELRAVIKKI WKKTSMKLLD QVVPPAGDDE 1620

VTVGKFYATF LIQDYFRKFK KRKEQGLVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680 

DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740 

ASDTEKPLFP PAGNSVCHNH HNHNSIGKQV PTSTNANLNN ANMSKAAHGK RPSIGNLEHV 1800 

SENGHHSSHK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPEI HGYFRDPHCL 1860

GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGF LEDDDSPVCY 1920 

DSRRSPRRRL LPPTPASHRR SSFNFECLRR QSSQEEVPSS PIFPHRTALP LHLMQQQIMA 1980 

VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040

RSSWYTDEPD ISYRTFTPAS LTVPSSFRNK NSDKQRSADS LVEAVLISEG LGRYARDPKF 2100 

VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160
EPDPGRDEED LADEMICITT L 

SEQ ID NO:289 0BI6 DNA SEQUENCE

Nucleic Acid Accession #: NMJW2812

Coding sequence: 150-3362 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

I I I I I I 

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG OGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280

CCGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 
CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900 

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTGTT TTTTTGTTTT 4140 
TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 

SEQ ID NO:290 0BI6 Protein sequence:
Protein Accession #: NP_002812

1 11 21 31 41 51 

I I I I I I 

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWLLDGA PVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSANASFN 120

IKWIEAGPVV LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFARVVLA PQDVVVARYE 240

EAMFHCQFSA QPPPSLQWLF EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300

CIGQGQRGPP IILEATLHLA EIEDMPLFEP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360

VRLPTHGRVY QKGHELVLAN IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420

SQLEEGKPGY LDCLTQATPK PTVVWYRNQM LISEDSRFEV FKNGTLRINS VEVYDGTWYR 480

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPQGQI RAHVQLTVAV FITFKVEPER 600

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660

RYTCIAGNSC NIKHTEAPLY VVDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720

GLMFYCKKRC KAKRLQKQPE GEEPEMECLN GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840

LDFRRELEMF GKLNHANVVR LLGLCREAEP HYMVLEYVDL GDLKQFLRIS KSKDEKLKSQ 900

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRQVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL RWMSPEAILE GDFSTKSDVW AFGVLMWEVF THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSFSEIASA LGDSTVDSKP 

SEQ ID NO:291 AAB1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002205

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980 
GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC TGATGCCTGA 



SEQ ID NO:292 AAB1 Protein sequence: 
Protein Accession #: NP_002196

1 11 21 31 41 51 

I I I I I I 

MGSRTPESPL HAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60

GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120

LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240

IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420

QQGVVFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480

VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600

LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660

GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720

FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780

SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840

SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900

SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 
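
As an illustrative aid to reading the listings above, the coding-sequence coordinates given with each nucleic acid sequence (for example 1-3150 for SEQ ID NO:291) can be checked against the paired protein sequence by translating the stated interval. The following Python sketch is not part of the original disclosure. It assumes that the coordinates are 1-based and inclusive, that the standard genetic code applies, and that the listing text contains only the letters A, C, G and T plus spacing and position counters. The function names `clean_listing` and `translate_cds` are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Keep only A, C, G and T; drop spaces, position counters and line breaks.

    Assumption: any other character is layout. A garbled base would also be
    dropped, which shifts every later coordinate, so repair the text first.
    """
    return "".join(ch for ch in text.upper() if ch in "ACGT")


def translate_cds(sequence, start, stop):
    """Translate positions start..stop (1-based, inclusive).

    Translation ends at the first stop codon; an unrecognised codon
    is reported as 'X'.
    """
    cds = sequence[start - 1:stop]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First line of SEQ ID NO:291 as printed, including the position counter.
fragment = "ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60"
print(translate_cds(clean_listing(fragment), 1, 60))
```

Applied to the first 60 nucleotides of SEQ ID NO:291, the sketch prints MGSRTPESPLHAVQLRWGPR, which agrees with the first 20 residues of SEQ ID NO:292.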



SEQ ID NO:293 LBH4 DNA SEQUENCE

Nucleic Acid Accession #: BC001291

Coding sequence: 44-541 (start and stop codons are underlined) 



1 11 21 31 41 51 
I I I I I I 

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTCGGG ACGATGGCGC TGCTCGCCTT 60
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG 120
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180 
TGAGAGAGAA AACACTTTCG AGTGCCAGAA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240
CTGCGTTATA GCGGCCGTGA AAATATTTCC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300 
CGCTGGTTGT GCAGCGATGG AGAGACXX^AA GCCAGAGGAG AAGCGGTTTC TCCTGGAAGA 360
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG 480
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCTTG 540
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTCX3CTCCA GACCGTTGTC 600 
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTTG GTTTGACTTC CCAGGGTCTT 660 
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG 780
AAATCAAACC TTGTAACTCA TTTATTGCTG ATGGCX^ACTC TTTTCCTTGA CTCCCCTCTG 840
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900 
TGCTGAGATG CTTCCGAOCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960 
GGGTGAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020 
AGGGCTGCCC CCATTCCAGT GGTGGAGGCG CTGTGGATGG CTGCTTTTCC TCAACCTTTC 1080 
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCA(XC 1140
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACX^ AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTT CACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 

SEQ ID NO:294 LBH4 Protein sequence:
Protein Accession #: AAH01291



1 11 21 31 41 51
I I I I I I 

MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS



It is understood that the examples described above in no way serve to limit the
true scope of this invention, but rather are presented for illustrative purposes. All
publications, sequences of accession numbers, and patent applications cited in this
specification are herein incorporated by reference as if each individual publication or patent
application were specifically and individually indicated to be incorporated by reference.

WHAT IS CLAIMED IS:

1. A method of detecting a prostate cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16.
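
The identity thresholds recited in claims 1 and 2 (at least 80% and at least 95%) can be illustrated numerically. The Python sketch below is an illustration under stated assumptions, not the definition used in this specification: it takes two sequences that have already been aligned to equal length (gaps written as "-") and counts identical columns over the full alignment length. Which alignment algorithm and which denominator apply is governed by the specification's own definition of percent identity. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns that hold the same residue in both rows.

    Assumption: both inputs come from one pairwise alignment, so they have
    equal length, and '-' marks a gap. A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Ten aligned columns, eight of them identical.
print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
print(meets_threshold("ACGTACGTAC", "ACGTACGTTT", threshold=95.0))
```

With this column-counting convention the example pair reaches the 80% threshold of claim 1 but not the 95% threshold of claim 2.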

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat prostate cancer.

12. The method of claim 1, wherein the patient is suspected of having
prostate cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated antibody in the
biological sample by contacting the biological sample with a polypeptide encoded by a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16, wherein the polypeptide specifically binds to the prostate
cancer-associated antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated antibody to a level of the prostate
cancer-associated antibody in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated polypeptide to a level of the prostate
cancer-associated polypeptide in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

21. The method of claim 19, wherein the patient is a human.

22. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1-16.

23. The nucleic acid molecule of claim 22, which is labeled.

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22.

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1-16.

28. An antibody that specifically binds a polypeptide of claim 27.

29. The antibody of claim 28, further conjugated to an effector component.

30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a prostate cancer cell in a biological sample
from a patient, the method comprising contacting the biological sample with an antibody of
claim 28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to prostate cancer in a
patient, the method comprising contacting a biological sample from the patient with a
polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16.

38. A method for identifying a compound that modulates a prostate
cancer-associated polypeptide, the method comprising the steps of:
(i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.

40. The method of claim 38, wherein the functional effect is a chemical
effect.

41. The method of claim 38, wherein the polypeptide is expressed in a
eukaryotic host cell or cell membrane.

42. The method of claim 38, wherein the functional effect is determined by
measuring ligand binding to the polypeptide.

43. The method of claim 38, wherein the polypeptide is recombinant.

44. A method of inhibiting proliferation of a prostate cancer-associated
cell to treat prostate cancer in a patient, the method comprising the step of administering to
the subject a therapeutically effective amount of a compound identified using the method of
claim 38.

45. The method of claim 44, wherein the compound is an antibody.

46. The method of claim 45, wherein the patient is a human.

47. A drug screening assay comprising the steps of:
(i) administering a test compound to a mammal having prostate cancer or a
cell isolated therefrom;
(ii) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a
treated cell or mammal with the level of gene expression of the polynucleotide in a control
cell or mammal, wherein a test compound that modulates the level of expression of the
polynucleotide is a candidate for the treatment of prostate cancer.

48. The assay of claim 47, wherein the control is a mammal with prostate
cancer or a cell therefrom that has not been treated with the test compound.

49. The assay of claim 47, wherein the control is a normal cell or mammal.

50. A method for treating a mammal having prostate cancer comprising
administering a compound identified by the assay of claim 47.

51. A pharmaceutical composition for treating a mammal having prostate
cancer, the composition comprising a compound identified by the assay of claim 47 and a
physiologically acceptable excipient.

52. The method according to claim 1, wherein said biological sample is
contacted with a plurality of polynucleotides comprising a first polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in
Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at
least 80% identical to a second sequence as shown in Tables 1-16.

53. A method according to claim 52, wherein the plurality of
polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a third sequence as shown in Tables 1-16.

54. A method of detecting a prostate cancer associated transcript, the
method comprising contacting a biological sample from the patient with a plurality of
polynucleotides wherein at least two of said polynucleotides selectively hybridize to a
different sequence at least 80% identical to a sequence as shown in Tables 1-16.

55. A method of detecting a prostate cancer, the method comprising the
steps of:
(i) providing a biological sample from a patient;
(ii) contacting the biological sample with a first polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to
determine the level of a prostate cancer-associated transcript in the biological sample; and
with a second polynucleotide that selectively hybridizes to a second sequence at least 80%
identical to a sequence not shown in Tables 1-16; wherein the expression of said second
sequence is not substantially changed in prostate cancer, to determine the level of expression
of a control transcript in the biological sample;
(iii) comparing the level of the prostate cancer-associated transcript to a level
of the normal tissue associated transcript in the biological sample.
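
The comparison recited in step (iii) of claim 55 can be pictured as a ratio of two measured signals. The Python sketch below is an illustration only and does not come from the specification. It assumes that each transcript level is available as a positive, background-subtracted signal (for example a hybridization intensity), that the control transcript is one whose expression is not substantially changed in prostate cancer, and that a reference sample measured the same way is available. The names `normalized_level` and `fold_change` are hypothetical.

```python
def normalized_level(target_signal, control_signal):
    """Express the target transcript level relative to the control transcript."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return target_signal / control_signal


def fold_change(sample_target, sample_control, reference_target, reference_control):
    """Ratio of the normalized target level in a sample to that in a reference.

    Values above 1 indicate a higher normalized level in the sample.
    """
    sample = normalized_level(sample_target, sample_control)
    reference = normalized_level(reference_target, reference_control)
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return sample / reference


# Hypothetical signals: the target is 4x the control in the patient sample
# and 0.5x the control in the reference sample.
print(fold_change(400.0, 100.0, 50.0, 100.0))
```

Dividing by the control signal first removes differences in the amount of material loaded, so that the two samples are compared on the same footing.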

56. A method of quantitating a prostate cancer-associated transcript in a
cell from a patient, the method comprising contacting a biological sample from the patient
with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1-16.

57. The method of claim 56, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16.

58. The method of claim 56, wherein the biological sample is a tissue
sample.

59. The method of claim 56, wherein the biological sample comprises
isolated nucleic acids.

60. The method of claim 56, wherein the nucleic acids are mRNA.

61. The method of claim 59, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

62. The method of claim 56, wherein the polynucleotide comprises a
sequence as shown in Tables 1-16.

63. The method of claim 56, wherein the polynucleotide is labeled.

64. The method of claim 63, wherein the label is a fluorescent label.

65. The method of claim 56, wherein the polynucleotide is immobilized on
a solid surface.

66. The method of claim 56, wherein the patient is undergoing a
therapeutic regimen to treat metastatic prostate cancer.

67. The method of claim 56, wherein the patient is suspected of having
metastatic prostate cancer.

68. A biochip comprising a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

69. A method of screening drug candidates comprising:
i) providing a cell that expresses an expression profile gene selected from the
group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof;
ii) adding a drug candidate to said cell; and
iii) determining the effect of said drug candidate on the expression of said
expression profile gene.

70. A method according to claim 59 wherein said determining comprises
comparing the level of expression in the absence of said drug candidate to the level of
expression in the presence of said drug candidate.